SERINE PROTEASES, NUCLEIC ACIDS ENCODING SERINE ENZYMES
AND VECTORS AND HOST CELLS INCORPORATING SAME

The present application claims priority under 35 U.S.C. §119, to co-pending U.S.
Provisional Patent Application Serial Number 60/523,609, filed November 19, 2003.

FIELD OF THE INVENTION 

The present invention provides novel serine proteases, novel genetic material
encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp.,
including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In
particular, the present invention provides protease compositions obtained from a
Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the
protease, host cells transformed with the vector DNA, and an enzyme produced by the host
cells. The present invention also provides cleaning compositions (e.g., detergent
compositions), animal feed compositions, and textile and leather processing compositions
comprising protease(s) obtained from a Micrococcineae spp., including but not limited to
Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e.,
variant) proteases derived from the wild-type proteases described herein. These mutant
proteases also find use in numerous applications.


BACKGROUND OF THE INVENTION 

Serine proteases are a subgroup of carbonyl hydrolases comprising a diverse class
of enzymes having a wide range of specificities and biological functions (See e.g., Stroud,
Sci. Amer., 131:74-88). Despite their functional diversity, the catalytic machinery of serine
proteases has been approached by at least two genetically distinct families of enzymes: 1)
the subtilisins; and 2) the mammalian chymotrypsin-related and homologous bacterial serine
proteases (e.g., trypsin and S. griseus trypsin). These two families of serine proteases
show remarkably similar mechanisms of catalysis (See e.g., Kraut, Ann. Rev. Biochem.,
46:331-358 [1977]). Furthermore, although the primary structure is unrelated, the tertiary
structure of these two enzyme families brings together a conserved catalytic triad of amino
acids consisting of serine, histidine and aspartate. The subtilisins and chymotrypsin-related
serine proteases both have a catalytic triad comprising aspartate, histidine and serine. In
the subtilisin-related proteases the relative order of these amino acids, reading from the
amino to carboxy terminus, is aspartate-histidine-serine. However, in the chymotrypsin-
related proteases, the relative order is histidine-aspartate-serine. Much research has been
conducted on the subtilisins, due largely to their usefulness in cleaning and feed
applications. Additional work has been focused on the adverse environmental conditions
(e.g., exposure to oxidative agents, chelating agents, extremes of temperature and/or pH)
which can adversely impact the functionality of these enzymes in various applications.
Nonetheless, there remains a need in the art for enzyme systems that are able to resist
these adverse conditions and retain or have improved activity over those currently known in
the art.
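By way of illustration only, the distinction drawn above between the two families can be sketched in a few lines of Python. The function name classify_triad_order and the returned labels are assumptions made for this example and form no part of the disclosure. The example input uses the catalytic triad positions His32, Asp56 and Ser137 recited in the Summary below.

```python
def classify_triad_order(positions):
    """Classify a serine protease by the N-to-C order of its catalytic triad.

    `positions` maps the one-letter residue codes 'D', 'H' and 'S' to their
    1-based positions in the mature sequence. The labels returned here are
    illustrative names chosen for this sketch.
    """
    # Sort the three residues by position to read them amino to carboxy terminus.
    order = "".join(sorted("DHS", key=lambda residue: positions[residue]))
    if order == "DHS":
        return "subtilisin-like"
    if order == "HDS":
        return "chymotrypsin-like"
    return "unclassified"


if __name__ == "__main__":
    # His32, Asp56, Ser137: histidine-aspartate-serine order.
    print(classify_triad_order({"H": 32, "D": 56, "S": 137}))
```

With the positions shown, the residues read histidine-aspartate-serine from the amino terminus, so the sketch reports the chymotrypsin-related ordering.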

SUMMARY OF THE INVENTION 

The present invention provides novel serine proteases, novel genetic material 
encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., 
including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In 
particular, the present invention provides protease compositions obtained from a 
Cellulomonas spp., DNA encoding the protease, vectors comprising the DNA encoding the
protease, host cells transformed with the vector DNA, and an enzyme produced by the host 
cells. The present invention also provides cleaning compositions (e.g., detergent 
compositions), animal feed compositions, and textile and leather processing compositions 
comprising protease(s) obtained from a Micrococcineae spp., including but not limited to 
Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., 
variant) proteases derived from the wild-type proteases described herein. These mutant 
proteases also find use in numerous applications. 

The present invention provides isolated serine proteases obtained from a member of 
the Micrococcineae. In some embodiments, the proteases are cellulomonadins. In some 
preferred embodiments, the protease is obtained from an organism selected from the group 
consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and 
Promicromonospora. In some particularly preferred embodiments, the protease is obtained 
from Cellulomonas 69B4. In further embodiments, the protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. In additional embodiments, the present invention 
provides isolated serine proteases comprising at least 45% amino acid identity with serine 
protease comprising SEQ ID NO:8. In some embodiments, the isolated serine proteases 
comprise at least 50% identity, preferably at least 55%, more preferably at least 60%, yet 
more preferably at least 65%, even more preferably at least 70%, more preferably at least
75%, still more preferably at least 80%, more preferably 85%, yet more preferably 90%,
even more preferably at least 95%, and most preferably 99% identity. 
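The identity thresholds above presuppose some way of scoring two aligned sequences. A minimal Python sketch follows, under stated assumptions: the two inputs are already aligned to equal length, gaps are written as "-", and columns containing a gap are excluded from the denominator. Other conventions exist, and the text does not say which one applies here. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return percent identity over the ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences (hypothetical, not taken from SEQ ID NO:8): 8 of 10 columns match.
    value = percent_identity("ACDEFGHIKL", "ACDQFGHIRL")
    print(value, value >= 45.0)
```

The comparison against 45.0 mirrors the lowest threshold recited above; any of the higher thresholds could be substituted.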

The present invention also provides compositions comprising isolated serine
proteases having immunological cross-reactivity with the serine proteases obtained from the
Micrococcineae. In some preferred embodiments, the serine proteases have
immunological cross-reactivity with serine protease obtained from Cellulomonas 69B4. In
alternative embodiments, the serine proteases have immunological cross-reactivity with
serine protease comprising the amino acid sequence set forth in SEQ ID NO:8. In
further embodiments, the serine proteases have cross-reactivity with fragments (i.e.,
portions) of any of the serine proteases obtained from the Micrococcineae, the
Cellulomonas 69B4 protease, and/or serine protease comprising the amino acid sequence
set forth in SEQ ID NO:8.

In some embodiments, the present invention provides the amino acid sequence set
forth in SEQ ID NO:8, wherein the sequence comprises substitutions at at least one amino
acid position selected from the group comprising positions 2, 8, 10, 11, 12, 13, 14, 15, 16,
24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81,
83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145,
155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188. In alternative
embodiments, the sequence comprises substitutions at at least one amino acid position
selected from the group comprising positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63,
66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146,
151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187, and 189.

In some preferred embodiments, the present invention provides protease variants
having an amino acid sequence comprising at least one substitution of an amino acid made
at a position equivalent to a position in a Cellulomonas 69B4 protease comprising the amino
acid sequence set forth in SEQ ID NO:8. In alternative embodiments, the present invention
provides protease variants having an amino acid sequence comprising at least one
substitution of an amino acid made at a position equivalent to a position in a Cellulomonas
69B4 protease comprising at least a portion of SEQ ID NO:8. In some embodiments, the
substitutions are made at positions equivalent to positions 2, 8, 10, 11, 12, 13, 14, 15, 16,
24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81,
83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 113, 116, 118, 119, 121, 123, 127, 145,
155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 185, 186, 187, and 188 in a Cellulomonas
69B4 protease having an amino acid sequence set forth in SEQ ID NO:8. In alternative
embodiments, the substitutions are made at positions equivalent to positions 1, 4, 22, 27,
28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114,
115, 117, 128, 134, 144, 143, 146, 151, 154, 156, 158, 161, 166, 176, 177, 181, 182, 187,
and 189, in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ
ID NO:8. In some preferred embodiments, the protease variants comprise the amino acid
sequence comprising SEQ ID NO:8, wherein at least one amino acid position
selected from the group consisting of 14, 16, 35, 36, 65, 75, 76, 79, 123, 127, 159, and 179,
are substituted with another amino acid. In some particularly preferred embodiments, the
proteases comprise at least one mutation selected from the group consisting of R14L, R16I,
R16L, R16Q, R35F, T36S, G65Q, Y75G, N76L, N76V, R79T, R123L, R123Q, R127A,
R127K, R127Q, R159K, R159Q, and R179Q. In some alternative preferred embodiments,
the proteases comprise multiple mutations selected from the group consisting of
R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, R14L/R179Q,
R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T. In some particularly preferred
embodiments, the proteases comprise the following mutations R123L, R127Q, and R179Q.
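The substitution shorthand used above gives the wild-type residue, its position, and the replacement residue (e.g., R14L), with a slash joining multiple mutations (e.g., R16Q/R35F/R159Q). A minimal Python sketch of how it can be read and applied follows. The function names and the toy eight-residue sequence are hypothetical; SEQ ID NO:8 itself is not reproduced in this section, so it is not used here.

```python
import re

# One substitution: wild-type letter, 1-based position, replacement letter.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(text):
    """Parse 'R16Q/R35F/R159Q' into [('R', 16, 'Q'), ('R', 35, 'F'), ('R', 159, 'Q')]."""
    substitutions = []
    for token in text.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        substitutions.append((match.group(1), int(match.group(2)), match.group(3)))
    return substitutions


def apply_variant(sequence, text):
    """Apply the substitutions to `sequence`, checking each wild-type residue."""
    residues = list(sequence)
    for wild_type, position, replacement in parse_variant(text):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_variant("R16Q/R35F/R159Q"))
    # Toy sequence (hypothetical) with arginine at positions 3 and 6.
    print(apply_variant("MKRTARLG", "R3L/R6Q"))
```

Checking the wild-type residue before substituting guards against applying a mutation to the wrong numbering scheme, which matters when positions are defined by equivalence to a reference sequence.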
The present invention also provides protease variants having amino acid sequences
comprising at least one substitution selected from the group consisting of T36I, A38R,
N73T, G77T, N24A, T36G, N24E, L69S, T36N, T36S, E119R, N74G, T36W, S76W,
N24Q, T36P, S76Y, T36H, G54D, G78A, S187P, R179V, N24V, V90P, T36D, L69H,
G65P, G65R, N7L, W103M, N55F, G186E, A70H, S76V, G186V, R159F, T36Y, T36V,
G65V, N24M, S51A, G66Y, Q71I, V66H, P118A, T116P, A38F, N24H, VWD OTL . G177M,
G186I, H850, Q71K, Q71G, G65S, A38D, P118F, A38S, G65T, N67G, T36R, P118R,
S114G, Y75I, I181H, G65Q, Y75G, T36F, A38H, R179M, T183I, G78S, A64W, Y75F,
G77S, N24L, W103I, V3L, Q81V, R1790, G54R, T36L, Q71H/I, A70S, G49F, G54L, G54H,
G78H, R 79,, Q81K, V90I, A38U, N67L, T109I, R179N, V66I, G78T, R179Y, I ^
N73S, E119K, V3I, Q71H, I11Q, A64H, R14E, R179T, L69V, V150L, Q71A, G65L, Q71N,
V90S, A64N, I11A, N145I, H85T, A64Y, N145Q, V66L, S92G, S188M, G78D, N67A, N7S,
V^' gTk, A70O, P118H, D2G, G54M, Q81H, D2Q, V66E, R79P, A38N, N145E, R179L,
T109H, R179K, V66A, G54A, G78N, T109A, R179A, N7A, R179E, H104K, A64R, and
V80L. In further embodiments, the amino acid sequences of the protease variants
comprise at least one substitution selected from the group consisting of I *SR H85U T62I,
N67H, G54I, N24F, T40V, T86A, G63V, G54Q, A64F, G77Y, R35F, T129S, R61M, I126L,
S N « R79G, T109P, R127F, R123E, P118,, T109R, 1718, T183K, N67T, P89N,
F ^ oTgVs,, T109U G78V, A64M, A64S, T10G, G77N, A64L, N67D, S76T, N42H,
D^F D184R, S76,, S78R, A38K, V72,, V3T, T107S, A38V, F47I, N55Q, S76E, P118Q,
T109G, Q71D, P118K, N67S, Q167R, N145G, I28L, I11T, A64I, G49K, G49A, G65A,
N170D, H85K, S185I, I181N, V80F, L69W, S76R, D184H, V150M, T183M, N67Q, S51Q,
A38Y, T107V, N145T, Q71F, A83N, S76A, N67R, T151L, T163L, S51F, Q81I, F47M, A41N,
P118E, N67Y, T107M, N73H, N67V, G63W, T10K, I181G, S187E, T107H, D2A, L142V,
A143N, A8G, S187L, V90A, G49L, N170L, G65H, T36C, G12W, S76Q, A143S, F1A, N7H,
S185V, A110T, N55K, N67F, N7I, A110S, N170A, Q81D, A64Q, Q71L, A38I, N112I, V90T,
N146L, A64T, 1118, A30S, R123I, D2H, V66M, Q71R, V90L, L68W, N24S, R159E, V66N,
D184Q, E133Q, A64V, D2N, G13M, T40S, S76K, G177S, G63Q, S15F, A8K, A70G, and
A38G. In some preferred embodiments, these variants have improved casein hydrolysis
performance as compared to wild-type Cellulomonas 69B4 protease.

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of R35E, R35D, 
R14E, R14D, Q167E, G49C, S15R, S15H, I11W, S15C, G49Q, R35Q, R35V, G49E,
R123D, R123Y, G49H, A38D, R35S, F47R, R123C, T151L, R14T, R35T, R123E, G49A,
G49V, D56L, R35N, R35A, G12D, R35C, R123N, T46V, R123H, S155C, T121E, R127E,
S113C, R123T, R16E, T46F, T121L, A38C, T46E, R123W, T44E, N55G, A8G, E119G,
R35P, R14G, F59W, R127S, R61E, R14S, S155W, R123F, R123S, G49N, R127D, E119Y,
A48E, N170D, R159T, S99A, G12Q, P118R, F165W, R127Q, R35H, G12N, A22C, G12V,
RIOT, Y57G, T100A, T46Y, R159E, E119R, T107R, T151C, G54C, E119T, R61V, I11E,
R14I, R61M, S15E, A22S, R16C, T36C, R16V, L125Q, M180L, R123Q, R14A, R14Q,
R35M, R127K, R159Q, N112P, G124D, R179E, G49L, A41D, G177D, R123V, E119V,
T10L, T109E, R179D, G12S, T10C, Q91Q, S15Y, S155Y, R14C, T163D, T121F, R14N,
F165E, N24E, A41C, R61T, G12I, P118K, T46C, I11T, R159D, N170C, R159V, S155I,
I11Q, D2P, T100aR159S, S114C, R16D, and P134R. In alternative embodiments, the
protease variants have amino acid sequences comprising at least one substitution selected
from the group consisting of S99G, T100K, R127A, F1P, S155V, T128A, F165H, G177E,
A70M, S140P, A87E, D2I, R159K, T36V, R179C, E119N, T10Y, I172A, A8T, F47V, W103L,
R61K, D2V, R179V, D2T, R159N, E119A, G54E, R16Q, G49S, R16I, S51L, S155E, S15M,
R179I, T10Q, G12H, R159C, R179T, T163C, R159A, A132S, N157D, G13E, L141M, A41T,
R123M, R14M, A8R, Q81P, N24T, T10D, A88F, R61Q, S99K, R179Y, T121A, N112E,
S155T, T151V, S99Q, T10E, S92T, T109K, T44C, R123A, A87C, S15F, S155F, D56F,
T10F, A83H, R179M, T121D, G13D, P118C, G49F, Q174C, S114E, T86E, F1N, T115C,
R127C, R123K, V66N, G12Y, S113A, S15N, A175T, R79T, R123G, R179S, R179N, R123I,
P118A, S187E, N112D, A70G, E119L, E119S, R159M, R14H, R179F, A64C, A41S,
R179W, N24G, T100Q, P118W, Q81 Q, G49K, R14L, N55A, R35K, R79V, D2M, T160D,
A83D, R179L, S51A, G12P, S99H, N42D, S188E, T10M, L125M, T116N, A70P, Q174S,
G65D, 8118D, E119Q, A83E, N170L, Q81A, S51C, P118Q, Q174T, I28V, S15G, and
T116G. In some preferred embodiments, these variants have improved LAS stability as
compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of G26I, G26K, 
G26Q, G26V, G26W, F27V, F27W, I28P, T29E, T129W, T40D, T40Q, R43D, P43H, P43K,
P43L, A22C, T40H, P89W, G91L, S18E, F59K, A30M, A30N, G31M, C33M, G161L, G161V,
P43N, G26E, N73P, G84C, G84P, G45V, C33L, Y9E, Y9P, A147E, C158H, I28W, A48P,
A22S, T62R, S137R, S155P, S155R, G156I, G156L, Q81A, R96C, I4D, MP, A70P, C105E,
C105G, C105K, C105M, C105N, C105S, T128A, T128V, T128G, S140P, G12D, C33N,
C33E, T164G, G45A, G156P, S99A, Q167L, S155W, I28T, R96F, A30P, R123W, T40P,
T39R, C105P, T100A, C105W, S155K, T46Y, R123F, I4G, S155Y, T46V, A93S, Y57N,
Q81S, G186S, G31H, T10Y, Q31V, A83H, A38D, R123Y, R79T, C158G, G31Y, Q81P,
R96E, A30Y, R159K, A22T, T40N, Y57M, G31N, Q81G, T164L, T121E, T10F, Q146P,
R123N, V3R, P43G, Q81H, Q81D, G161I, C158M, N24T, T10W, T128S, T160I, Y176P,
S155F, T128C, L125A, P168Y, T62G, F166S, S188A, Q81F, T46W, A70G, and A38G. In
alternative embodiments, the protease variants have amino acid sequences comprising at
least one substitution selected from the group consisting of S188E, S188V, Y117K, Y117Q,
Y117R, Y117V, R127K, R127Q, R123L, T86S, R123., Q81E, L125M, H32A, S188T, N74F,
C33D, F27I, A83M, Q71Y, R123T, V90A, F59W, L141C, N170E, T46F, S51V, G162P,
S185R, A41S, R79V, T151C, T107S, T129Y, M180L, F166C, C105T, T160E, P89A, R159T,
T183P, S188M, T10L, G25S, N24S, E119L, T107L, T107Q, G161K, G15Q, S15R, G153K,
G153V, S188G, A83E, G186P, T121D, G49A, S15C, C105Y, C105A, R127F, Q71A, T10C,
R179K, T86I, W103N, A87S, F166A, A83F, R123Q, A132C, A143H, T163I, T39V, A93D,
V90M, R123K, P134W, G177N, V1 161, S155T, T110D, G105L, N170D, T107A, G84V,
G84M, L111K, P168I, G154L, T183I, S99G, S15T, A8G, S15N, P189S, S188C, T100Q,
A110G, A121A, G12A, R159V, G31A, G154R, T182L, V115L, T160Q, T107F, R159Q,
G144A, S92T, T101S, A83R, G12HM S15H, T116Q, T36V, G154, Q81C, V130T, T183A,
P118T, A87E, T86M, V150N, and N24E. In some preferred embodiments, these variants
have improved thermostability as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, I172T,
N24E, N170Y, G77T, G186N, I181L, N73T, A38R, N74G, N24A, G54D, S76D, R123E,
R159E, N112E, R35E, R179V, R123D, N24T, R179T, R14L, A38D, V90P, R14Q, R123I,
R179D, S76V, R79G, R35L, S76E, S76Y, R79D, R79P, R35Q, R179N, N112D, R179E,
G65P, Y75G, V90S, R179M, R35F, R123F, A64I, N24Q, R14I, R179A, R127A, R179I,
N170D, R35A, R159F, T109E, R14D, N67D, G49A, N112Q, G78D, T121E, L69S, T116E, 
V90I, T36S, T36G, N145E, T86D, S51D, R179K, T107E, T129S, L142V, R79A, R79E, 
A38H, T107S, R123A, N55E, R123L, R159N, G65D, R14N, G65Q, R123Q, N24V, R14G, 
T116Q, A38N, R159Q, R179Y, A83E, N112L, S99N, G78A, T10N, H85Q, R35Q, N24L, 
N24H, G49S, R79L, S76T, S76L, G65S, N55F, R79V, G65T, R123N, T86E, Y75F, F1T, 
S76N, S99V, R79T, N112V, R79M, T107V, R79S, G54E, G65V, R127Q, R159D, T107H, 
H85T, R35T, T36N, Q81E, R123H, S76I, A38F, V90T, and R14T. In alternative 
embodiments, the protease variants have amino acid sequences comprising at least one 
substitution selected from the group consisting of G65L, S99D, T107M, S113T, S99T, 
G77S, R14M, A64N, R61M, A70D, Q71G, A93D, S92G, N112Y, S15W, R159K, N67G, 
T10E, R127H, A64Y, R159C, A38L, T160E, T183E, R127S, A8E, S51Q, N7L, G63D, A38S, 
R35H, R14K, T107I, G12D, A64L, S76W, A41N, R35M, A64V, A38Y, T183I, W103M, A41D, 
R127K, T36D, R61T, G65Y, G13S, R35Y, R123T, A64H, G49H, A70H, A64F, R127Y, 
R61E, A64P, T121D, V115A, R123Y, T101S, T182V, H85L, N24M, R127E, N145D, Q71H, 
S76Q, A64T, G49F, A64Q, T10D, F1D, A70G, R35W, Q71D, N121I, A64M, T36H, A8G, 
T107N, R35S, N67T, S92A, N170L, N67E, S114A, R14A, R14S, Q81D, S51H, R123S, 
A93S, R127F, E119V, T40V, S185N, R123G, R179L, S51V, T163D, T109I, A64S, V72I,
N67S, R159S, H85M, T109G, Q71S, R61H, T107A, Q81V, V90N, T109A, A38T, N145T, 
R159A, A110S, Q81H, A48E, S51T, A64W, R159L, N67H, A93E, T116F, R61S, R123V, 
V3L, and R159Y. In some preferred embodiments, these variants have improved keratin 
hydrolysis activity as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, P89D, 
A93T, A93S, T36N, N73T, T36G, R159F, T36S, A38R, S99W, S76W, T36P, G77T, G54D, 
R127A, R159E, H85Q, T36D, S76L, S99N, Y75G, S76Y, R127S, N24E, R127Q, D184F, 
N170Y, N24A, S76T, H85L, Y75F, S76V, L69S, R159K, R127K, G65P, N74G, R159H, 
G65Q, G186V, A48Q, T36H, N67L, R14I, R127L, T36Y, S76I, S114G, R127H, S187P, V3L, 
G78D, R123I, I181Q, R35F, H85R, R127Y, N67S, Q81P, R123F, R159N, S99A, S76D, 
A132V, R127F, A143N, S92A, N24T, R79P, S76N, R14M, G186E, N24Q, N67A, R127T, 
H85K, G65T, G65Y, R179V, Y75I, I11Q, A38L, T36L, R159Y, R159D, N24V, G65S, N157D,
G186I, G54Q, N67Y, R127G, S76A, A38S, T109E, V66H, T116F, R123L, G49A, A64H, 
T36W, D184H, S99D, G161K, P134E, A64F, N67G, S99T, D2Q, S76E, R16Q, G54N, 
N67V, R35L, Q71I, N7L, N112E, L69H, N24H, G54I, R16L, N24M, A64Y, S113A, H85F, 
R79G, I11A, T121D, R61V, and G65L. In alternative embodiments, the protease variants
have amino acid sequences comprising at least one substitution selected from the group
consisting of N67Q, S187Q, Q71H, T163D, R61K, R159V, Q71F, V31F, V90I, R79D, 
T160E, R123Q, A38Y, S113G, A88F, A70G, I11T, G78A, N24L, S92G, R14L, D184R,
G54L, N112L, H85Y, R16N, G77S, R179T, V80L, G65V, T121E, Q71D, R16G, P89N, 
N42H, G49F, I11S, R61M, R159C, G65R, T183I, A93D, L111E, S51Q, G78N, N67T, A38N,
T40V, A64W, R159L, T10E, R179K, R123E, V90P, A64N, G161E, H85T, A8G, L142V, 
A41N, S185I, Q71L, A64T, R16I, A38D, G54M, N112Q, R16A, R14E, V80H, N170D, S99G, 
R179N, S15E, G49H, A70P, A64S, G54A, S185W, R61H, T10Q, A38F, N170L, T10L, 
N67F, G12D, D184T, R14N, S187E, R14P, N112D, S140A, N112G, G49S, L111D, N67M,
V150L, G12Y, R123K, P89V, V66D, G77N, S51T, A8D, I181H, T86N, R179D, N55F, N24S, 
D184L, R61S, N67K, G186L, F1T, R159A, I11L, R61T, D184Q, A93E, Q71T, R179E,
L69W, T163I, S188Q, L125V, A38V, R35A, P134G, A64V, N145D, V90T, and A143S. In 
some preferred embodiments, these variants have improved BMI performance as compared 
to wild-type Cellulomonas 69B4 protease. 

The present invention also provides protease variants having amino acid sequences 
comprising at least one substitution selected from the group consisting of T36I, N170Y, 
A38R, R79P, G77T, L69S, N73T, S76V, S76Y, R179V, T36N, N55F, R159F, G54D, G65P, 
L69H, T36G, G177M, N24E, N74G, R159E, T36S, Y75G, S76I, S76D, A8R, A24A, V90P, 
R159C, G65Q, T121E, A8V, S76L, T109E, R179M, A8T, T107N, G186E, S76W, R123E, 
A38F, T36P, N67G, Y75F, S76N, R179I, S187P, N67V, V90S, R127A, R179Y, R35F, 
N145S, G65S, R61M, S51A, R179N, R123D, N24T, N55E, R79C, G186V, R123I, G161E, 
G65Y, A38S, R14L, V90I, R79G, N145E, N67L, R127S, R150Y, M180D, N67T, A93D, 
T121D, Q81V, T109I, A93E, T107S, R179T, R179L, R179K, R159D, R179A, R79E, R123F,
R79D, T36D, A64N, L142V, T109A, I172V, A83N, T85A, R179D, A38L, I126L, R127Q,
R127L, L69W, R127K, G65T, R127H, P134A, N67D, R14M, N24Q, A143N, N55S, N67M,
S51D, S76E, T163D, A38D, R159K, T183I, G63V, A8S, T107M, H85Q, N112E, N67F, 
N67S, A64H, T86I, P134E, T182V, N67Y, A64S, G78D, V90T, R61T, R16Q, G65R, T86L, 
V90N, R159Q, G54I, S76C, R179E, V66D, L69V, R127Y, R35L, R14E, and T86F. In 
alternative embodiments, the protease variants have amino acid sequences comprising at 
least one substitution selected from the group consisting of G186I, A64Q, T109G, G64L, 
N24L, A8E, N112D, A38H, R179W, S114G, R123L, A8L, T129S, N170D, R159N, N67C, 
S92C, T107A, G54E, T107E, T36V, R127T, A8N, H85L, A110S, N170C, A64R, A132V, 
T36Y, G63D, W103M, T151V, R123P, W103Y, S76T, S187T, R127F, N67A, P171M, A70S, 
R159H, S76Q, L125V, G54Q, G49L, R14I, R14Q, A83I, V90L, T183E, R159A, T101S, 
G65D, G54A, T107Q, Q71M, T86E, N24M, N55Q, R61V, P134D, R96K, A88F, N145Q,
A64M, A64T, N24V, S140A, A8H, A64I, R123Q, T183Q, N24H, A64W, T62I, T129G, R35A,
T40V, I11T, A38N, N145G, A175T, G77Q, T109H, A8P, R35E, T109N, A110T, N67Q,
G63P, H85R, S92G, A175V, S51Q, G63Q, T116F, G65A, R79L, N145P, L69Q, Q146D,
A83D, F166Y, R123A, T121L, R123H, A70P, T182W, S76A, A64F, T107H ^QBU
R123K, A64L, N67R, V3L, S187E, S161K, T86M, I4M, G77N, G49A, A41N, G54M, T107V,
Q81E, A38I, T109L, T183K, A70G, Q71D, T183L, Q81H, A64V, A93Q, S188E, 861 F,
G186P, G186T, R159L, P134G, N145T, N55V, V66E, R159V, Y176L, and R16L. In some
preferred embodiments, these variants have improved BMI performance under low pH
conditions, as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides serine proteases comprising at least a portion
of an amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID
NO:6, SEQ ID NO:7, and SEQ ID NO:9. In some embodiments, the nucleotide sequences
encoding these serine proteases comprise a nucleotide sequence selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, and SEQ ID NO:5.
In some embodiments, the serine proteases are variants having amino acid sequences that
are similar to that set forth in SEQ ID NO:8. In some preferred embodiments, the proteases
are obtained from a member of the Micrococcineae. In some particularly preferred
embodiments, the proteases are obtained from an organism selected from the group
consisting of Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and
Promicromonospora. In some particularly preferred embodiments, the protease is obtained
from variants of Cellulomonas 69B4.

The present invention also provides isolated protease variants having amino acid
sequences comprising at least one substitution of an amino acid made at a position
equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid
sequence set forth in SEQ ID NO:8, wherein the amino acid of the protease comprises
Arg14, Ser15, Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116,
Tyr117, Pro118, Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151,
Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Thr164, and Phe165. In some
embodiments, the catalytic triad of the proteases comprises His32, Asp56, and Ser137. In
alternative embodiments, the proteases comprise Cys131, Ala132, Glu133, Pro134, Gly135,
Thr151, Ser152, Gly153, Gly154, Ser155, Gly156, Asn157 and Gly162, Thr163, and
Thr164. In some preferred embodiments, the amino acid sequence of the proteases
comprise Phe52, Tyr117, Pro118 and Glu119. In some particularly preferred embodiments,
the amino acid sequences of the proteases have main-chain to main-chain hydrogen
bonding from Gly154 to the substrate main-chain.

In some embodiments, the proteases of the present invention comprise three disulfide
bonds. In some preferred embodiments, the disulfide bonds are located between C17 and
C38, C95 and C105, and C131 and C158. In some particularly preferred embodiments, the
disulfide bonds are located between C17 and C38, C95 and C105, and C131 and C158 of
SEQ ID NO:8. In alternative protease variant embodiments, the disulfide bonds are located
at positions equivalent to the disulfide bonds in SEQ ID NO:8.

The present invention also provides isolated protease variants having amino acid
sequences comprising at least one substitution of an amino acid made at a position
equivalent to a position in a Cellulomonas 69B4 protease comprising the amino acid
sequence set forth in SEQ ID NO:8, wherein the variants have altered substrate specificities
as compared to wild-type Cellulomonas 69B4 protease. In some further preferred
embodiments, the variants have altered pIs as compared to wild-type Cellulomonas 69B4
protease. In additional preferred embodiments, the variants have improved stability as
compared to wild-type Cellulomonas 69B4 protease. In still further preferred embodiments,
the variants exhibit altered surface properties as compared to wild-type Cellulomonas 69B4
protease. In additional particularly preferred embodiments, the variants comprise
at least one substitution at sites selected from the group consisting of 1, 2, 4, 7, 8, 10, 11,
12, 13, 14, 15, 16, 22, 24, 25, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 57, 59, 61, 62, 63, 64, 65, 66, 67, 68, 69, 71, 73, 74, 75, 76, 77,
78, 79, 80, 81, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 95, 99, 100, 101, 1«, 1 ■,
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 123,
124, 126, 127, 128, 130, 131, 132, 133, 134, 135, 137, 143, 144, 145, 146, 147, 148, 152,
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 170, 171,
173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, and 184.

The present invention also provides protease variants having at least one improved
property as compared to the wild-type protease. In some particularly preferred
embodiments, the variants are variants of a serine protease obtained from a member of the
Micrococcineae. In some particularly preferred embodiments, the proteases are obtained
from an organism selected from the group consisting of Cellulomonas, Oerskovia,
Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some particularly
preferred embodiments, the protease is obtained from variants of Cellulomonas 69B4. In
some preferred embodiments, at least one improved property is selected from the group
consisting of acid stability, thermostability, casein hydrolysis, keratin hydrolysis, cleaning
performance, and LAS stability.

The present invention also provides expression vectors comprising a polynucleotide
sequence encoding protease variants having amino acid sequences comprising at least one 
substitution of an amino acid made at a position equivalent to a position in a Cellulomonas 
69B4 protease comprising the amino acid sequence set forth in SEQ ID NO:8. In further 
embodiments, the present invention provides host cells comprising these expression 
vectors. In some particularly preferred embodiments, the host cells are selected from the 
group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp.
The present invention also provides the serine proteases produced by the host cells. 

The present invention also provides variant proteases comprising an amino acid 
sequence selected from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64, 66, 68, 
70, 72, 74, 76, and 78. In some preferred embodiments, the amino acid sequence is
encoded by a polynucleotide sequence selected from the group consisting of SEQ ID 
NOS:53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, and 77. In further embodiments, the 
present invention provides expression vectors comprising a polynucleotide sequence 
encoding at least one protease variant. In additional embodiments, the present invention 
provides host cells comprising these expression vectors. In some particularly preferred 
embodiments, the host cells are selected from the group consisting of Bacillus sp., 
Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides 
the serine proteases produced by the host cells. 

The present invention also provides compositions comprising at least a portion of an 
isolated serine protease obtained from a member of the Micrococcineae, wherein the
protease is encoded by a polynucleotide sequence selected from the group consisting of 
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, and SEQ ID NO:4. In some preferred 
embodiments, the sequence comprises at least a portion of SEQ ID NO:1. In further
embodiments, the present invention provides host cells comprising these expression 
vectors. In some particularly preferred embodiments, the host cells are selected from the 
group consisting of Bacillus sp., Streptomyces sp., Aspergillus sp., and Trichoderma sp. 
The present invention also provides the serine proteases produced by the host cells. 

The present invention also provides variant serine proteases, wherein the proteases 
comprise at least one substitution corresponding to the amino acid positions in SEQ ID 
NO:8, and wherein variant proteases have better performance in at least one property 
selected from the group consisting of keratin hydrolysis, thermostability, casein activity, LAS 
stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease. 

The present invention also provides isolated polynucleotides comprising a nucleotide 
sequence (i) having at least 70% identity to SEQ ID NO:4, or (ii) being capable of hybridizing
to a probe derived from the nucleotide sequence set forth in SEQ ID NO:4, under conditions
of intermediate to high stringency, or (iii) being complementary to the nucleotide sequence 
set forth in SEQ ID NO:4. In embodiments, the present invention provides expression 
vectors encoding at least one such polynucleotide. In further embodiments, the present 
invention provides host cells comprising these expression vectors. In some particularly 
preferred embodiments, the host cells are selected from the group consisting of Bacillus sp., 
Streptomyces sp., Aspergillus sp., and Trichoderma sp. The present invention also provides 
the serine proteases produced by the host cells. In further embodiments, the present 
invention provides polynucleotides that are complementary to at least a portion of the 
sequence set forth in SEQ ID NO:4.
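As a purely illustrative aid, the notion of a polynucleotide being complementary to a given sequence can be sketched in Python as follows. The function names and the six-nucleotide toy sequence are hypothetical, and the sketch assumes DNA written 5' to 3' with only the letters A, C, G and T; SEQ ID NO:4 is not reproduced here.

```python
# Watson-Crick pairing for DNA; case is preserved.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, also written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_complementary(strand_a, strand_b):
    """True when strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGCCG"))
    print(is_complementary("ATGCCG", "CGGCAT"))
```

Exact complementarity is only the limiting case; hybridization under conditions of intermediate to high stringency tolerates some mismatch, which this sketch does not model.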

The present invention also provides methods of producing an enzyme having 
protease activity, comprising: transforming a host cell with an expression vector comprising 
a polynucleotide having at least 70% sequence identity to SEQ ID NO:4; and cultivating the 
transformed host cell under conditions suitable for the host cell to produce the protease. In 
some embodiments, the host cell is selected from the group consisting of Streptomyces, 
Aspergillus, Trichoderma, and Bacillus species. 

The present invention also provides probes comprising a 4 to 150 nucleotide sequence 
substantially identical to a corresponding fragment of SEQ ID NO:4, wherein the probe is 
used to detect a nucleic acid sequence coding for an enzyme having proteolytic activity, and 
wherein the nucleic acid sequence is obtained from a member of the Micrococcineae. In 
some embodiments, the Micrococcineae is a Cellulomonas spp. In some preferred 
embodiments, the Cellulomonas is Cellulomonas strain 69B4. 

The present invention also provides cleaning compositions comprising at least one 
serine protease obtained from a member of the Micrococcineae. In some embodiments, at 
least one protease is obtained from an organism selected from the group consisting of 
Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In 
some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some 
particularly preferred embodiments, at least one protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. In some further embodiments, the present invention 
provides isolated serine proteases comprising at least 45% amino acid identity with serine 
protease comprising SEQ ID NO:8. In some embodiments, the isolated serine proteases 
comprise at least 50% identity, preferably at least 55%, more preferably at least 60%, yet 
more preferably at least 65%, even more preferably at least 70%, more preferably at least 
75%, still more preferably at least 80%, more preferably 85%, yet more preferably 90%, 
even more preferably at least 95%, and most preferably 99% identity. 

The present invention further provides cleaning compositions comprising at least one 
serine protease, wherein at least one of the serine proteases has immunological cross- 
reactivity with the serine protease obtained from a member of the Micrococcineae. In some 
preferred embodiments, the serine proteases have immunological cross-reactivity with 
serine protease obtained from Cellulomonas 69B4. In alternative embodiments, the serine 
proteases have immunological cross-reactivity with serine protease comprising the amino 
acid sequence set forth in SEQ ID NO:8. In still further embodiments, the serine proteases 
have cross-reactivity with fragments (i.e., portions) of any of the serine proteases obtained 
from the Micrococcineae, the Cellulomonas 69B4 protease, and/or serine protease 
comprising the amino acid sequence set forth in SEQ ID NO:8. 

The present invention further provides cleaning compositions comprising at least one 
serine protease, wherein the protease is a variant protease having an amino acid sequence 
comprising at least one substitution of an amino acid made at a position equivalent to a 
position in a Cellulomonas 69B4 protease having an amino acid sequence set forth in SEQ 
ID NO:8. In some embodiments, the substitutions are made at positions equivalent to 
positions 2, 8, 10, 11, 12, 13, 14, 15, 16, 24, 26, 31, 33, 35, 36, 38, 39, 40, 43, 46, 49, 51, 
54, 61, 64, 65, 67, 70, 71, 76, 78, 79, 81, 83, 85, 86, 90, 93, 99, 100, 105, 107, 109, 112, 
113, 116, 118, 119, 121, 123, 127, 145, 155, 159, 160, 163, 165, 170, 174, 179, 183, 184, 
185, 186, 187, and 188 in a Cellulomonas 69B4 protease comprising an amino acid 
sequence set forth in SEQ ID NO:8. In alternative embodiments, the substitutions are made 
at positions equivalent to positions 1, 4, 22, 27, 28, 30, 32, 41, 47, 48, 55, 59, 63, 66, 69, 75, 
77, 80, 84, 87, 88, 89, 92, 96, 110, 111, 114, 115, 117, 128, 134, 144, 143, 146, 151, 154, 
156, 158, 161, 166, 176, 177, 181, 182, 187, and 189, in a Cellulomonas 69B4 protease 
comprising an amino acid sequence set forth in SEQ ID NO:8. In further embodiments, the 
protease comprises at least one amino acid substitution at positions 14, 16, 35, 36, 65, 75, 
76, 79, 123, 127, 159, and 179, in an equivalent amino acid sequence to that set forth in 
SEQ ID NO:8. In still further embodiments, the protease comprises at least one mutation 
selected from the group consisting of R14L, R16I, R16L, R16Q, R35F, T36S, G65Q, Y75G, 
N76L, N76V, R79T, R123L, R123Q, R127A, R127K, R127Q, R159K, R159Q, and R179Q. 
In yet additional embodiments, the protease comprises a set of mutations selected from the 
group consisting of the sets R16Q/R35F/R159Q, R16Q/R123L, R14L/R127Q/R159Q, 
R14L/R179Q, R123L/R127Q/R179Q, R16Q/R79T/R127Q, and R16Q/R79T. In some 
particularly preferred embodiments, the protease comprises the following mutations R123L, 
R127Q, and R179Q. In some particularly preferred embodiments, the variant serine 
proteases comprise at least one substitution corresponding to the amino acid positions in 
SEQ ID NO:8, and wherein the variant proteases have better performance in at least one 
property selected from the group consisting of keratin hydrolysis, thermostability, casein 
activity, LAS stability, and cleaning, as compared to wild-type Cellulomonas 69B4 protease. 
In some embodiments, the variant protease comprises an amino acid sequence selected 
from the group consisting of SEQ ID NOS:54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, and 
78. In alternative embodiments, the variant protease amino acid sequence is encoded by a 
polynucleotide sequence selected from the group consisting of SEQ ID NOS:53, 55, 57, 59, 
61, 63, 65, 67, 69, 71, 73, 75, and 77. 
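
The substitution notation used above (e.g., R123L, or a set such as R16Q/R35F/R159Q) names the wild-type residue, the position, and the replacement residue. The following minimal Python sketch parses that notation and applies it to a sequence. The function names are hypothetical, positions are assumed to be 1-based, and the seven-residue toy sequence is invented for illustration; it is not SEQ ID NO:8.

```python
import re
from typing import List, Tuple

# One substitution such as "R123L": wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation_set(text: str) -> List[Tuple[str, int, str]]:
    """Parse a set such as "R16Q/R35F/R159Q" into (wild_type, position, new) tuples."""
    parsed = []
    for token in text.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        wild_type, position, new = match.groups()
        parsed.append((wild_type, int(position), new))
    return parsed


def apply_mutations(sequence: str, mutations: str) -> str:
    """Apply substitutions to `sequence`, checking each wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, new in parse_mutation_set(mutations):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutation_set("R123L/R127Q/R179Q"))
    toy = "MKRAGRT"  # hypothetical 7-residue sequence, NOT SEQ ID NO:8
    print(apply_mutations(toy, "R3L/R6Q"))
```

Checking the wild-type residue before substituting guards against applying a mutation to a sequence that is numbered differently from the reference.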

The present invention also provides cleaning compositions comprising a cleaning 
effective amount of a proteolytic enzyme, the enzyme comprising an amino acid sequence 
having at least 70% sequence identity to SEQ ID NO:4, and a suitable cleaning formulation. 
In some preferred embodiments, the cleaning compositions further comprise one or more 
additional enzymes or enzyme derivatives selected from the group consisting of proteases, 
amylases, lipases, mannanases, pectinases, cutinases, oxidoreductases, hemicellulases, 
and cellulases. 

The present invention also provides compositions comprising at least one serine 
protease obtained from a member of the Micrococcineae, wherein the compositions further 
comprise at least one stabilizer. In some embodiments, the stabilizer is selected from the 
group consisting of borax and glycerol. In some embodiments, the present invention 
provides competitive inhibitors suitable to stabilize the enzyme of the present invention to 
anionic surfactants. In some embodiments, at least one protease is obtained from an 
organism selected from the group consisting of Cellulomonas, Oerskovia, 
Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some preferred 
embodiments, the protease is obtained from Cellulomonas 69B4. In some particularly 
preferred embodiments, at least one protease comprises the amino acid sequence set forth 
in SEQ ID NO:8. 

The present invention further provides compositions comprising at least one serine 
protease obtained from a member of the Micrococcineae, wherein the serine 
protease is an autolytically stable variant. In some embodiments, at least one variant 
protease is obtained from an organism selected from the group consisting of Cellulomonas, 
Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In some 
preferred embodiments, the variant protease is obtained from Cellulomonas 69B4. In some 
particularly preferred embodiments, at least one variant protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. 

The present invention also provides cleaning compositions comprising at least 
0.0001 weight percent of the serine protease of the present invention, and optionally, an 
adjunct ingredient. In some embodiments, the composition comprises an adjunct ingredient. 
In some preferred embodiments, the composition comprises a sufficient amount of a pH 
modifier to provide the composition with a neat pH of from about 3 to about 5, the 
composition being essentially free of materials that hydrolyze at a pH of from about 3 to 
about 5. In some particularly preferred embodiments, the materials that hydrolyze comprise 
a surfactant material. In additional embodiments, the cleaning composition is a liquid 
composition. In further embodiments, the surfactant material comprises a sodium alkyl 
sulfate surfactant that comprises an ethylene oxide moiety. 

The present invention additionally provides cleaning compositions that comprise at 
least one acid stable enzyme, the cleaning composition comprising a sufficient amount of a 
pH modifier to provide the composition with a neat pH of from about 3 to about 5, the 
composition being essentially free of materials that hydrolyze at a pH of from about 3 to 
about 5. In further embodiments, the materials that hydrolyze comprise a surfactant 
material. In some preferred embodiments, the cleaning composition is a liquid 
composition. In yet additional embodiments, the surfactant material comprises a sodium 
alkyl sulfate surfactant that comprises an ethylene oxide moiety. In some embodiments, 
the cleaning composition comprises a suitable adjunct ingredient. In some additional 
embodiments, the composition comprises a suitable adjunct ingredient. In some preferred 
embodiments, the composition comprises from about 0.001 to about 0.5 weight % of ASP. 
In some alternatively preferred embodiments, the composition comprises from about 0.01 to 
about 0.1 weight percent of ASP. 
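
To make the weight-percent ranges above concrete, the following minimal Python sketch converts a weight percent into the mass of enzyme present in a batch of a given mass. The function name and the 1 kg batch size are assumptions chosen only for illustration.

```python
def enzyme_mass_g(composition_mass_g: float, weight_percent: float) -> float:
    """Mass of enzyme (g) in a composition of the given mass at the given weight percent."""
    if composition_mass_g < 0 or not 0 <= weight_percent <= 100:
        raise ValueError("mass must be non-negative and weight percent within 0-100")
    return composition_mass_g * weight_percent / 100.0


if __name__ == "__main__":
    batch_g = 1000.0  # hypothetical 1 kg batch of cleaning composition
    # Range endpoints recited above: 0.001 to 0.5 wt% and 0.01 to 0.1 wt%.
    for wt in (0.001, 0.01, 0.1, 0.5):
        print(f"{wt} wt% -> {enzyme_mass_g(batch_g, wt):.2f} g enzyme per kg")
```

On this arithmetic, the range of about 0.001 to about 0.5 weight % corresponds to roughly 0.01 g to 5 g of enzyme per kilogram of composition.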

The present invention also provides methods of cleaning, comprising the steps 
of a) contacting a surface and/or an article comprising a fabric with the cleaning 
composition comprising the serine protease of the present invention at an appropriate 
concentration; and b) optionally washing and/or rinsing the surface or material. In 
alternative embodiments, any suitable composition provided herein finds use in these 
methods. 

The present invention also provides animal feed comprising at least one serine 
protease obtained from a member of the Micrococcineae. In some embodiments, at least 
one protease is obtained from an organism selected from the group consisting of 
Cellulomonas, Oerskovia, Cellulosimicrobium, Xylanibacterium, and Promicromonospora. In 
some preferred embodiments, the protease is obtained from Cellulomonas 69B4. In some 
particularly preferred embodiments, at least one protease comprises the amino acid 
sequence set forth in SEQ ID NO:8. 

The present invention provides an isolated polypeptide having proteolytic activity, 
(e.g., a protease) having the amino acid sequence set forth in SEQ ID NO:8. In some 
embodiments, the present invention provides isolated polypeptides having approximately 
40% to 98% identity with the sequence set forth in SEQ ID NO:8. In some preferred 
embodiments, the polypeptides have approximately 50% to 95% identity with the sequence 
set forth in SEQ ID NO:8. In some additional preferred embodiments, the polypeptides have 
approximately 60% to 90% identity with the sequence set forth in SEQ ID NO:8. In yet 
additional embodiments, the polypeptides have approximately 65% to 85% identity with the 
sequence set forth in SEQ ID NO:8. In some particularly preferred embodiments, the 
polypeptides have approximately 90% to 95% identity with the sequence set forth in SEQ ID 
NO:8. 
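
The identity ranges above can be illustrated with a minimal Python sketch that computes percent identity for two sequences that have already been aligned. This uses one simple convention (identical columns divided by aligned columns, with gaps counted as mismatches); it is an assumption for illustration and does not replace whatever alignment method the specification itself defines. The function names and the two toy sequences are hypothetical and are not SEQ ID NO:8.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns with identical residues.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def within_range(identity: float, low: float, high: float) -> bool:
    """True if `identity` lies within the closed range [low, high] percent."""
    return low <= identity <= high


if __name__ == "__main__":
    # Hypothetical pre-aligned toy sequences, NOT SEQ ID NO:8.
    reference = "MKRAGRTLV-"
    candidate = "MKLAGQTLVA"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identity")
    print(within_range(identity, 65.0, 85.0))
    print(within_range(identity, 90.0, 95.0))
```

In this toy pair, seven of the ten aligned columns match, so the candidate falls within the 65% to 85% range but outside the 90% to 95% range.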

The present invention further provides proteases obtained from bacteria of the 
suborder Micrococcineae. In some preferred embodiments, the proteases are obtained 
from members of the family Promicromonosporaceae. In yet further embodiments, the 
proteases are obtained from any member of the genera Xylanimicrobium, Xylanibacterium, 
Xylanimonas, Myceligenerans, and Promicromonospora. In some preferred embodiments, 
the proteases are obtained from members of the family Cellulomonadaceae. In some 
particularly preferred embodiments, the proteases are obtained from members of the genera 
Cellulomonas and Oerskovia. In some further preferred embodiments, the proteases are 
derived from Cellulomonas spp. In some embodiments, the Cellulomonas spp. is selected 
from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas 
hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, 
Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, 
Cellulomonas fermentans, Cellulomonas xylanilytica, and 
Cellulomonas strain 69B4 (DSM 16035). 

In alternative embodiments, the proteases are derived from Oerskovia spp. In some 
preferred embodiments, the Oerskovia spp. is selected from Oerskovia jenensis, Oerskovia 
paurometabola, Oerskovia enterophila, Oerskovia turbata and Oerskovia turbata strain DSM 
20577. 

In some embodiments, the proteases have apparent molecular weights of about 
17 kD to 21 kD as determined by a matrix assisted laser desorption/ionization - time of flight 
("MALDI-TOF") spectrophotometer. 

The present invention further provides isolated polynucleotides that encode 
proteases comprising an amino acid sequence having at least 40% amino acid sequence 
identity to SEQ ID NO:8. In some embodiments, the proteases have at least 50% amino 
acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 
60% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases 
have at least 70% amino acid sequence identity to SEQ ID NO:8. In some embodiments, 
the proteases have at least 80% amino acid sequence identity to SEQ ID NO:8. In some 
embodiments, the proteases have at least 90% amino acid sequence identity to SEQ ID 
NO:8. In some embodiments, the proteases have at least 95% amino acid sequence 
identity to SEQ ID NO:8. The present invention also provides expression vectors comprising 
any of the polynucleotides provided above. 

The present invention further provides host cells transformed with the expression 
vectors of the present invention, such that at least one protease is expressed by the host 
cells. In some embodiments, the host cells are bacteria, while in other embodiments, the 
host cells are fungi. In some preferred embodiments, the bacterial host cells are selected 
from the group consisting of the genera Bacillus and Streptomyces. In some alternative 
preferred embodiments, the fungal host cells are members of the genus Trichoderma, while 
in other alternative preferred embodiments, the fungal host cells are members of the genus 
Aspergillus. 

The present invention also provides isolated polynucleotides comprising a nucleotide 
sequence (i) having at least 70% identity to SEQ ID NOS:3 or 4, or (ii) being capable of 
hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NOS:3 or 
4, under conditions of medium to high stringency, or (iii) being complementary to the 
nucleotide sequence disclosed in SEQ ID NOS:3 or 4. In some embodiments, the present 
invention provides vectors comprising such a polynucleotide. In further embodiments, the 
present invention provides host cells transformed with such a vector. 

The present invention further provides methods for producing at least one enzyme 
having protease activity, comprising the steps of: transforming a host cell with an expression 
vector comprising a polynucleotide comprising at least 70% sequence identity to SEQ ID 
NO:4; cultivating the transformed host cell under conditions suitable for the host cell to 
produce the protease; and recovering the protease. In some preferred embodiments, the 
host cell is a Streptomyces spp., while in other embodiments, the host cell is a Bacillus spp., 
a Trichoderma spp., and/or an Aspergillus spp. In some embodiments, the Streptomyces 
spp. is Streptomyces lividans. In alternative embodiments, the host cell is T. reesei. In 
further embodiments, the Aspergillus spp. is A. niger. 

The present invention also provides fragments (i.e., portions) of the DNA encoding 
the proteases provided herein. These fragments find use in obtaining partial length DNA 
fragments capable of being used to isolate or identify polynucleotides encoding the mature 
protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having 
proteolytic activity. In some embodiments, portions of the DNA provided in SEQ ID NO:1 
find use in obtaining homologous fragments of DNA from other species, and particularly 
from Micrococcineae spp. which encode a protease or portion thereof having proteolytic 
activity. 

The present invention further provides at least one probe comprising a 
polynucleotide substantially identical to a fragment of SEQ ID NOS:1, 2, 3 or 4, wherein the 
probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic 
activity, and wherein the nucleic acid sequence is obtained from a bacterial source. In some 
embodiments, the bacterial source is a Cellulomonas spp. In some preferred embodiments, 
the bacterial source is Cellulomonas strain 69B4. 

The present invention further provides compositions comprising at least one of the 
proteases provided herein. In some preferred embodiments, the compositions are cleaning 
compositions. In some embodiments, the present invention provides cleaning compositions 
comprising a cleaning effective amount of at least one protease comprising an amino acid 
sequence having at least 40% sequence identity to SEQ ID NO:8, at least 90% sequence 
identity to SEQ ID NO:8, and/or having an amino acid sequence of SEQ ID NO:8. In some 
embodiments, the cleaning compositions further comprise at least one suitable cleaning 
adjunct. In some embodiments, the protease is derived from a Cellulomonas sp. In some 
preferred embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, 
Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas 
flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, 
Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, and Cellulomonas strain 
69B4 (DSM 16035). In some particularly preferred embodiments, the Cellulomonas spp. is 
Cellulomonas strain 69B4. In still further embodiments, the cleaning composition further 
comprises at least one additional enzyme or enzyme derivative selected from the group 
consisting of protease, amylase, lipase, mannanase and cellulase. 

The present invention also provides isolated naturally occurring proteases 
comprising an amino acid sequence having at least 45% sequence identity to SEQ ID NO:8, 
at least 60% sequence identity to SEQ ID NO:8, at least 75% sequence identity to SEQ ID 
NO:8, at least 90% sequence identity to SEQ ID NO:8, at least 95% sequence identity to 
SEQ ID NO:8, and/or having the amino acid sequence of SEQ ID NO:8, the protease being 
isolated from a Cellulomonas spp. In some embodiments, the protease is isolated from 
Cellulomonas strain 69B4 (DSM 16035). 

In additional embodiments, the present invention provides engineered variants of the 
serine proteases of the present invention. In some embodiments, the engineered variants 
are genetically modified using recombinant DNA technologies, while in other embodiments, 
the variants are naturally occurring. The present invention further encompasses engineered 
variants of homologous enzymes. In some embodiments, the engineered variant 
homologous proteases are genetically modified using recombinant DNA technologies, while 
in other embodiments, the variant homologous proteases are naturally occurring. 

The present invention also provides serine proteases that immunologically cross- 
react with the Cellulomonas 69B4 protease (i.e., ASP) of the present invention. Indeed, it is 
intended that the present invention encompass fragments (e.g., epitopes) of the ASP 
protease that stimulate an immune response in animals (including, but not limited to 
humans) and/or are recognized by antibodies of any class. The present invention further 
encompasses epitopes on proteases that are cross-reactive with ASP epitopes. In some 
embodiments, the ASP epitopes are recognized by antibodies, but do not stimulate an 
immune response in animals (including, but not limited to humans), while in other 
embodiments, the ASP epitopes stimulate an immune response in at least one animal 
species (including, but not limited to humans) and are recognized by antibodies of any class. 
The present invention also provides means and compositions for identifying and assessing 
cross-reactive epitopes. 

The present invention further provides at least one polynucleotide encoding a signal 
peptide (i) having at least 70% sequence identity to SEQ ID NO:9, or (ii) being capable of 
hybridizing to a probe derived from the polypeptide sequence encoding SEQ ID NO:9, under 
conditions of medium to high stringency, or (iii) being complementary to the polypeptide 
sequence provided in SEQ ID NO:9. In further embodiments, the present invention provides 
vectors comprising the polynucleotide described above. In yet additional embodiments, a 
host cell is provided that is transformed with the vector. 

The present invention also provides methods for producing proteases, comprising: 
(a) transforming a host cell with an expression vector comprising a polynucleotide having at 
least 70% sequence identity to SEQ ID NO:4, at least 95% sequence identity to SEQ ID 
NO:4, and/or having a polynucleotide sequence of SEQ ID NO:4; (b) cultivating the 
transformed host cell under conditions suitable for the host cell to produce the protease; and 
(c) recovering the protease. In some embodiments, the host cell is a Bacillus species 
(e.g., B. subtilis, B. clausii, or B. licheniformis). In alternative embodiments, the host cell is a 
Streptomyces spp. (e.g., Streptomyces lividans). In additional embodiments, the host cell 
is a Trichoderma spp. (e.g., Trichoderma reesei). In yet further embodiments, the host cell 
is an Aspergillus spp. (e.g., Aspergillus niger). 

As will be appreciated, an advantage of the present invention is that a polynucleotide 
has been isolated which provides the capability of isolating further polynucleotides which 
encode proteins having serine protease activity, wherein the backbone is substantially 
identical to that of the Cellulomonas protease of the present invention. 

In further embodiments, the present invention provides means to produce host cells 
that are capable of producing the serine proteases of the present invention in relatively large 
quantities. In particularly preferred embodiments, the present invention provides means to 
produce protease with various commercial applications where degradation or synthesis of 
polypeptides is desired, including cleaning compositions, as well as feed components, 
textile processing, leather finishing, grain processing, meat processing, cleaning, 
preparation of protein hydrolysates, digestive aids, microbicidal compositions, bacteriostatic 
compositions, fungistatic compositions, and personal care products, including oral care, 
hair care, and/or skin care. 

The present invention further provides enzyme compositions that have comparable or 
improved wash performance, as compared to presently used subtilisin proteases. Other 
objects and advantages of the present invention are apparent from the present 
Specification. 

The present invention provides an isolated polypeptide having proteolytic activity, 
(e.g., a protease) having the amino acid sequence set forth in SEQ ID NO:8. In some 
embodiments, the present invention provides isolated polypeptides having approximately 
40% to 98% identity with the sequence set forth in SEQ ID NO:8. In some preferred 
embodiments, the polypeptides have approximately 50% to 95% identity with the sequence 
set forth in SEQ ID NO:8. In some additional preferred embodiments, the polypeptides have 
approximately 60% to 90% identity with the sequence set forth in SEQ ID NO:8. In yet 
additional embodiments, the polypeptides have approximately 65% to 85% identity with the 
sequence set forth in SEQ ID NO:8. In some particularly preferred embodiments, the 
polypeptides have approximately 90% to 95% identity with the sequence set forth in SEQ ID 
NO:8. 

The present invention further provides proteases obtained from bacteria of the 
suborder Micrococcineae. In some preferred embodiments, the proteases are obtained 
from members of the family Promicromonosporaceae. In yet further embodiments, the 
proteases are obtained from any member of the genera Xylanimicrobium, Xylanibacterium, 
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Xylanimonas, Myceligenerans, and Promicromonospora. In some preferred embodiments, 
the proteases are obtained from members of the family Cellulomonadaceae. In some 
particularly preferred embodiments, the proteases are obtained from members of the genera 
Cellulomonas and Oerskovia. In some further preferred embodiments, the proteases are 
derived from Cellulomonas spp. In some embodiments, the Cellulomonas spp. is selected 
from Cellulomonas fimi, Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas 
hominis, Cellulomonas flavigena, Cellulomonas persica, Cellulomonas iranensis, 
Cellulomonas gelida, Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, 
Cellulomonas fermentans, Cellulomonas xylanilytica, Cellulomonas humilata and 
Cellulomonas strain 69B4 (DSM 1 6035). 

In alternative embodiments, the proteases are derived from Oerskovia spp. In some 
pre f errec j embodiments, the Oerskovia spp. is selected from Oerskovia jenensis, Oerskovia 
paurometabola, Oerskovia enterophila, Oerskovia turbata and Oerskovia turbata strain DSM 
20577. 

In some embodiments, the proteases have apparent molecular weights of about 
17kD to 21 kD as determined by a matrix assisted laser desorption/ionizaton - time of flight 
("MALDI-TOF) spectrophotometer. 

The present invention further provides isolated polynucleotides that encode 
proteases comprise an amino acid sequence comprising at least 40% amino acid sequence 
identity to SEQ ID NO:8. In some embodiments, the proteases have at least 50% amino 
acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases have at least 
60% amino acid sequence identity to SEQ ID NO:8. In some embodiments, the proteases 
have at least 70% amino acid sequence identity to SEQ ID NO:8. In some embodiments, 
the proteases have at least 80% amino acid sequence identity to SEQ ID NO:8. In some 
embodiments, the proteases have at least 90% amino acid sequence identity to SEQ ID 
NO:8. In some embodiments, the proteases have at least 95% amino acid sequence 
identity to SEQ ID NO:8. The present invention also provides expression vectors comprising 
any of the polynucleotides provided above. 

The present invention further provides host cells transformed with the expression 
vectors of the present invention, such that at least one protease is expressed by the host 
cells. In some embodiments, the host cells are bacteria, while in other embodiments, the 
host cells are fungi. In some preferred embodiments, the bacterial host cells are selected 
from the group consisting of the genera Bacillus and Streptomyces. In some alternative 
preferred embodiments, the fungal host cells are members of the genus Trichoderma, while 
in other alternative preferred embodiments, the fungal host cells are members of the genus 
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Aspergillus. 

The present invention also provides isolated polynucleotides comprising a nucleotide 
sequence (i) having at least 70% identity to SEQ ID NOS:3 or 4, or (ii) being capable of 
hybridizing to a probe derived from the nucleotide sequence disclosed in SEQ ID NOS: 3 or 
4, under conditions of medium to high stringency, or (Hi) being complementary to the 
nucleotide sequence disclosed in SEQ ID NOS:3 or 4. In some embodiments, the present 
invention provides vectors comprising such polynucleotide. In further embodiments, the 
present invention provides host cells transformed with such vector. 

The present invention further provides methods for producing at least one enzyme 
having protease activity, comprising: the steps of transforming a host cell with an expression 
vector comprising a polynucleotide comprising at least 70% sequence identity to SEQ ID 
NO:4, cultivating the transformed host cell under conditions suitable for the host cell to 
produce the protease; and recovering the protease. In some preferred embodiments, the 
host cell is a Streptomyces spp, while in other embodiments, the host cell is a Bacillus spp„ 
a Trichoderma spp., and/or a Aspergillus spp. In some embodiments, the Streptomyces 
spp. is Streptomyces lividans. In alternative embodiments, the host cell is 7. reeseL In 
further embodiments, the Aspergillus spp. is A. niger. 

The present invention also provides fragments (/.a, portions) of the DNA encoding 
the proteases provided herein^ These fragments find use in obtaining partial length DNA 
fragments capable of being used to isolate or identify polynucleotides encoding mature 
protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having 
proteolytic activity. In some embodiments, portions of the DNA provided in SEQ ID NO:1 
find use in obtaining homologous fragments of DNA from other species, and particularly 
from Micrococcineae spp. which encode a protease or portion thereof having proteolytic 
activity. . 

The present invention further provides at least one probe comprising a 
polynucleotide substantially identical to a fragment of SEQ ID NOS:1, 2, 3 or 4, wherein the 
probe is used to detect a nucleic acid sequence coding for an enzyme having proteolytic 
activity, and wherein the nucleic acid sequence is obtained from a bacterial source. In some 
embodiments, the bacterial source is a Cellulomonas spp. In some preferred embodiments, 
the bacterial source is Cellulomonas strain 69B4. 

The present invention further provides compositions comprising at least one of the 
proteases provided herein. In some preferred embodiments, the compositions are cleaning 
compositions. In some embodiments, the present invention provides cleaning compositions 
comprising a cleaning effective amount of at least one protease comprising an amino acid 
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sequence having at least 40% sequence identity to SEQ ID NO:8, at least 90% sequence 
identity to SEQ ID NO:8, and/or having an amino acid sequence of SEQ ID NO:8. In some 
embodiments, the cleaning compositions further comprise at least one suitable cleaning 
adjunct. In some embodiments, the protease is derived from a Cellulomonas sp. In some 
preferred embodiments, the Cellulomonas spp. is selected from Cellulomonas fimi, 
Cellulomonas biazotea, Cellulomonas cellasea, Cellulomonas hominis, Cellulomonas 
flavigena, Cellulomonas persica, Cellulomonas iranensis, Cellulomonas gelida, 
Cellulomonas humilata, Cellulomonas turbata, Cellulomonas uda, and Cellulomonas strain 
69B4 (DSM 16035). In some particularly preferred embodiments, the Cellulomonas spp is 
Cellulomonas. strain 69B4. In still further embodiments, the cleaning composition further 
comprises at least one additional enzymes or enzyme derivatives selected from the group 
consisting of protease, amylase, lipase, mannanase and cellulase. 

The present invention also provides isolated naturally occurring proteases 
comprising an amino acid sequence having at least 45% sequence identity to SEQ ID NO:8, 
at least 60% sequence identity to SEQ ID NO:8, at least 75% sequence identity to SEQ ID 
NO:8, at least 90% sequence identity to SEQ ID NO:8, at least 95% sequence identity to 
SEQ ID NO:8, and/or having the amino acid sequence of SEQ ID NO:8, the protease being
isolated from a Cellulomonas spp. In some embodiments, the protease is isolated from
Cellulomonas strain 69B4 (DSM 16035). 
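
By way of illustration only, a percent sequence identity threshold of the kind recited above (e.g., at least 45%, 60%, 75%, 90%, or 95% identity to SEQ ID NO:8) can be checked on a pair of sequences that have already been aligned. The following Python sketch is not part of the disclosed methods. The function names `percent_identity` and `meets_identity_threshold` are hypothetical, the toy sequences are not SEQ ID NO:8, and counting identity only over columns in which neither sequence carries a gap is an assumption, since other conventions for the denominator are also in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs are rows of the same pairwise alignment, so they
    have equal length and use `gap` to mark insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are skipped (assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for demonstration only).
    query = "ACDEFG"
    reference = "ACDQFG"
    print(round(percent_identity(query, reference), 2))
    print(meets_identity_threshold(query, reference, 75.0))
```

In practice the alignment itself would be produced by an established alignment program; the sketch only shows how a stated threshold could be applied once an alignment is available.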

In additional embodiments, the present invention provides engineered variants of the 
serine proteases of the present invention. In some embodiments, the engineered variants 
are genetically modified using recombinant DNA technologies, while in other embodiments, 
the variants are naturally occurring. The present invention further encompasses engineered 
variants of homologous enzymes. In some embodiments, the engineered variant 
homologous proteases are genetically modified using recombinant DNA technologies, while 
in other embodiments, the variant homologous proteases are naturally occurring. 

The present invention also provides serine proteases that immunologically cross-react
with the ASP protease of the present invention. Indeed, it is intended that the present
invention encompass fragments (e.g., epitopes) of the ASP protease that stimulate an 
immune response in animals (including, but not limited to humans) and/or are recognized by 
antibodies of any class. The present invention further encompasses epitopes on proteases 
that are cross-reactive with ASP epitopes. In some embodiments, the ASP epitopes are 
recognized by antibodies, but do not stimulate an immune response in animals (including, 
but not limited to humans), while in other embodiments, the ASP epitopes stimulate an 
immune response in at least one animal species (including, but not limited to humans) and 
are recognized by antibodies of any class. The present invention also provides means and
compositions for identifying and assessing cross-reactive epitopes.

The present invention further provides at least one polynucleotide encoding a signal 
peptide (i) having at least 70% sequence identity to SEQ ID NO:9, or (ii) being capable of 
hybridizing to a probe derived from the polypeptide sequence encoding SEQ ID NO:9, under 
conditions of medium to high stringency, or (iii) being complementary to the polypeptide
sequence provided in SEQ ID NO:9. In further embodiments, the present invention provides
vectors comprising the polynucleotide described above. In yet additional embodiments, a
host cell is provided that is transformed with the vector. 

The present invention also provides methods for producing proteases, comprising: 
(a) transforming a host cell with an expression vector comprising a polynucleotide having at 
least 70% sequence identity to SEQ ID NO:4, at least 95% sequence identity to SEQ ID 
NO:4, and/or having a polynucleotide sequence of SEQ ID NO:4; (b) cultivating the 
transformed host cell under conditions suitable for the host cell to produce the protease; and
(c) recovering the protease. In some embodiments, the host cell is a Bacillus species
(e.g., B. subtilis, B. clausii, or B. licheniformis). In alternative embodiments, the host cell is a
Streptomyces spp. (e.g., Streptomyces lividans). In additional embodiments, the host cell
is a Trichoderma spp. (e.g., Trichoderma reesei). In yet further embodiments, the host cell
is an Aspergillus spp. (e.g., Aspergillus niger).

As will be appreciated, an advantage of the present invention is that a polynucleotide 
has been isolated which provides the capability of isolating further polynucleotides which 
encode proteins having serine protease activity, wherein the backbone is substantially 
identical to that of the Cellulomonas protease of the invention. 

In further embodiments, the present invention provides means to produce host cells 
that are capable of producing the serine proteases of the present invention in relatively large 
quantities. In particularly preferred embodiments, the present invention provides means to 
produce protease with various commercial applications where degradation or synthesis of
polypeptides is desired, including cleaning compositions, as well as feed components,
textile processing, leather finishing, grain processing, meat processing, cleaning,
preparation of protein hydrolysates, digestive aids, microbicidal compositions, bacteriostatic
compositions, fungistatic compositions, and personal care products, including oral care, hair
care, and/or skin care.

The present invention further provides enzyme compositions that have comparable or
improved wash performance, as compared to presently used subtilisin proteases. Other
objects and advantages of the present invention are apparent from the present
Specification.

DESCRIPTION OF THE FIGURES 

Figure 1 provides an unrooted phylogenetic tree illustrating the relationship of novel 
strain 69B4 to members of the family Cellulomonadaceae and other related genera of the 
suborder Micrococcineae. 

Figure 2 provides a phylogenetic tree for ASP protease. 

Figure 3 provides a MALDI TOF spectrum of a protease derived from Cellulomonas
strain 69B4.

Figure 4 shows the sequence of the N-terminal most tryptic peptide from C. flavigena.

Figure 5 provides the plasmid map of the pSEGCT vector. 

Figure 6 provides the plasmid map of the pSEGCT69B4 vector. 

Figure 7 provides the plasmid map of the pSEA469BCT vector. 

Figure 8 provides the plasmid map of the pHPLT-Asp-C1-1 vector.

Figure 9 provides the plasmid map of the pHPLT-Asp-C1-2 vector. 

Figure 10 provides the plasmid map of the pHPLT-Asp-C2-1 vector. 

Figure 11 provides the plasmid map of the pHPLT-Asp-C2-2 vector.

Figure 12 provides the plasmid map of the pHPLT-ASP-III vector.

Figure 13 provides the plasmid map of the pHPLT-ASP-IV vector. 

Figure 14 provides the plasmid map of the pHPLT-ASP-VII vector. 

Figure 15 provides the plasmid map of the pXX-KpnI vector.

Figure 16 provides the plasmid map of the p2JM103-DNNP1 vector. 

Figure 17 provides the plasmid map of the pHPLT vector. 

Figure 18 provides the map and MXL-prom sequences for the opened pHPLT-ASP-C1-2.

Figure 19 provides the plasmid map of the pENMx3 vector.

Figure 20 provides the plasmid map of the pICatH vector.

Figure 21 provides the plasmid map of the pTREX4 vector. 

Figure 22 provides the plasmid map of the pSLGAMpR2 vector. 

Figure 23 provides the plasmid map of the pRAXdes2-ASP vector. 

Figure 24 provides the plasmid map of the pAPDI vector.

Figure 25 provides graphs showing ASP autolysis. Panel A provides a graph 
showing the ASP autolysis peptides observed in a buffer without LAS. Panel B provides a
graph showing the ASP autolysis peptides observed in a buffer with 0.1% LAS. 

Figure 26 compares the cleaning activity (absorbance at 405 nm) dose (ppm)
response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-]; RELASE™ [-▲-];
and OPTIMASE™ [-■-]) in liquid TIDE® detergent under North American wash
conditions.

Figure 27 provides a graph that compares the cleaning activity (absorbance at 405
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-];
RELASE™ [-▲-]; and OPTIMASE™ [-■-]) in Detergent Composition III powder detergent
(0.66 g/l) North American concentration/detergent formulation under Japanese wash
conditions.

Figure 28 provides a graph that compares the cleaning activity (absorbance at 405
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-];
RELASE™ [-▲-]; and OPTIMASE™ [-■-]) in ARIEL® REGULAR detergent powder under
European wash conditions.

Figure 29 provides a graph that compares the cleaning activity (absorbance at 405
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-];
RELASE™ [-▲-]; and OPTIMASE™ [-■-]) in PURE CLEAN detergent powder under
Japanese conditions.

Figure 30 provides a graph that compares the cleaning activity (absorbance at 405
nm) dose (ppm) response curves of certain serine proteases (69B4 [-x-]; PURAFECT® [-♦-];
RELASE™ [-▲-]; and OPTIMASE™ [-■-]) in Detergent Composition III powder (1.00 g/l)
under North American conditions.

Figure 31 provides a graph that shows comparative oxidative inactivation of various
serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes)
(69B4 [-x-]; BPN'-variant 1 [-♦-]; PURAFECT® [-▲-]; and GG36-variant 1 [-■-]) with 0.1 M
H2O2 at pH 9.45, 25°C.

Figure 32 provides a graph that shows comparative chelator inactivation of various
serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes)
(69B4 [-x-]; BPN'-variant 1 [-♦-]; PURAFECT® [-▲-]; and GG36-variant 1 [-■-]) with 10 mM
EDTA at pH 8.20, 45°C.

Figure 33 provides a graph that shows comparative thermal inactivation of various
serine proteases (100 ppm) as a measure of percent enzyme activity over time (minutes)
(69B4 [-x-]; BPN'-variant [-♦-]; PURAFECT® [-▲-]; and GG36-variant 1 [-■-]) with 50 mM
Tris at pH 8.0, 45°C.

Figure 34 provides a graph that shows comparative thermal inactivation of certain
serine proteases (69B4 [-x-]; BPN'-variant [-♦-]; PURAFECT® [-▲-]; and GG36-variant 1
[-■-]) at pH 8.60, over a temperature gradient of 57°C to 62°C.

Figure 35 provides a graph that shows enzyme activity (hydrolysis of di-methyl
casein measured by absorbance at 405 nm) of certain serine proteases (2.5 ppm) (69B4
[-■-]; BPN'-variant [-♦-]; PURAFECT® [-▲-]; and GG36-variant 1 [-•-]) at pHs ranging from
5 to 12 at 37°C.

Figure 36 provides a bar graph that shows enzyme stability as indicated by %
remaining activity (hydrolysis of di-methyl casein measured by absorbance at 405 nm) of
certain serine proteases (2.5 ppm) (69B4; BPN'-variant; PURAFECT®; and GG36-variant 1)
at pHs ranging from 3, 4, 5 to 6 at 25°, 35°, and 45°C,
respectively.

Figure 37 provides a graph that shows enzyme stability as indicated by % remaining
activity of a BPN'-variant at pHs ranging from 3 (-♦-), 4 (-■-), 5 (-▲-) to 6 (-X-) at 25°,
35°, and 45°C, respectively.

Figure 38 provides a graph that shows enzyme stability as indicated by % remaining
activity of PURAFECT® protease at pHs ranging from 3 (-♦-), 4 (-■-), 5 (-▲-) to 6 (-X-)
at 25°, 35°, and 45°C, respectively.

Figure 39 provides a graph that shows enzyme stability as indicated by % remaining
activity of 69B4 protease at pHs ranging from 3 (-♦-), 4 (-■-), 5 (-▲-) to 6 (-X-) at 25°,
35°, and 45°C, respectively.

DESCRIPTION OF THE INVENTION 

The present invention provides novel serine proteases, novel genetic material 
encoding these enzymes, and proteolytic proteins obtained from Micrococcineae spp., 
including but not limited to Cellulomonas spp. and variant proteins developed therefrom. In 
particular, the present invention provides protease compositions obtained from a 
Cellulomonas spp, DNA encoding the protease, vectors comprising the DNA encoding the 
protease, host cells transformed with the vector DNA, and an enzyme produced by the host 
cells. The present invention also provides cleaning compositions (e.g., detergent 
compositions), animal feed compositions, and textile and leather processing compositions 
comprising protease(s) obtained from a Micrococcineae spp., including but not limited to 
Cellulomonas spp. In alternative embodiments, the present invention provides mutant (i.e., 
variant) proteases derived from the wild-type proteases described herein. These mutant 
proteases also find use in numerous applications.

Gram-positive alkalophilic bacteria have been isolated from in and around alkaline 
soda lakes (See e.g., U.S. Pat. No. 5,401,657, herein incorporated by reference). These
alkalophilic bacteria were analyzed according to the principles of numerical taxonomy with respect to
each other and also a collection of known bacteria, and taxonomically characterized. Six 
natural clusters or phenons of alkalophilic bacteria were generated. Amongst the strains 
isolated was a strain identified as 69B4. 

Cellulomonas spp. are Gram-positive bacteria classified as members of the family 
Cellulomonadaceae, Suborder Micrococcineae, Order Actinomycetales, Class 
Actinobacteria. Cellulomonas grows as slender, often irregular rods that may occasionally 
show branching, but no mycelium is formed. In addition, there is no aerial growth and no 
spores are formed. Cellulomonas and Streptomyces are only distantly related at a genetic 
level. The large genetic (genomic) distinction between Cellulomonas and Streptomyces is 
reflected in a great difference in phenotypic properties. While serine proteases in 
Streptomyces have been previously examined, there apparently have been no reports of 
any serine proteases (approx. MW 18,000 to 20,000) secreted by Cellulomonas spp. In 
addition, there apparently have been no previous reports of Cellulomonas proteases being 
used in the cleaning and/or feed industry. 

Streptomyces are Gram-positive bacteria classified as members of the Family 
Streptomycetaceae, Suborder Streptomycineae, Order Actinomycetales, class 
Actinobacteria. Streptomyces grows as an extensively branching primary or substrate 
mycelium and an abundant aerial mycelium that at maturity bears characteristic spores.
Streptogrisins are serine proteases secreted in large amounts from a wide variety of 
Streptomyces species. The amino acid sequences of Streptomyces proteases have been 
determined from at least 9 different species of Streptomyces including Streptomyces griseus 
Streptogrisin C (accession no. P52320); alkaline proteinase (EC 3.4.21.-) from 
Streptomyces sp. (accession no. PC2053); alkaline serine proteinase I from Streptomyces 
sp. (accession no. S34672); serine protease from Streptomyces lividans (accession no.
CAD4208); putative serine protease from Streptomyces coelicolor A3(2) (accession no.
NP_625129); putative serine protease from Streptomyces avermitilis MA-4680 (accession
no. NP_822175); serine protease from Streptomyces lividans (accession no. CAD42809);
putative serine protease precursor from Streptomyces coelicolor A3(2) (accession no.
NP_628830). A purified native alkaline protease having an apparent molecular weight of
19,000 daltons, isolated from Streptomyces griseus var. alcalophilus, and
cleaning compositions comprising it have been described (See e.g., U.S. Patent No.
5,646,028, incorporated herein by reference).

The present invention provides protease enzymes produced by these organisms. 
Importantly, these enzymes have good stability and proteolytic activity. These enzymes find 
use in various applications, including but not limited to cleaning compositions, animal feed, 
textile processing, etc. The present invention also provides means to produce these
enzymes. In some preferred embodiments, the proteases of the present invention are in 
pure or relatively pure form. 

The present invention also provides nucleotide sequences which are suitable to 
produce the proteases of the present invention in recombinant organisms. In some 
embodiments, recombinant production provides means to produce the proteases in 
quantities that are commercially viable. 

Unless otherwise indicated, the practice of the present invention involves 
conventional techniques commonly used in molecular biology, microbiology, and 
recombinant DNA, which are within the skill of the art. Such techniques are known to those 
of skill in the art and are described in numerous texts and reference works (See e.g., 
Sambrook et al., "Molecular Cloning: A Laboratory Manual", Second Edition (Cold Spring
Harbor), [1989]; and Ausubel et al., "Current Protocols in Molecular Biology" [1987]). All
patents, patent applications, articles and publications mentioned herein, both supra and 
infra, are hereby expressly incorporated herein by reference. 

Unless defined otherwise herein, all technical and scientific terms used herein have 
the same meaning as commonly understood by one of ordinary skill in the art to which this 
invention pertains. For example, Singleton and Sainsbury, Dictionary of Microbiology and 
Molecular Biology, 2d Ed., John Wiley and Sons, NY (1994); and Hale and Marham, The 
Harper Collins Dictionary of Biology, Harper Perennial, NY (1991) provide those of skill in 
the art with general dictionaries of many of the terms used in the invention. Although any
methods and materials similar or equivalent to those described herein find use in the 
practice of the present invention, the preferred methods and materials are described herein. 
Accordingly, the terms defined immediately below are more fully described by reference to 
the Specification as a whole. Also, as used herein, the singular "a", "an" and "the" include
the plural reference unless the context clearly indicates otherwise. Numeric ranges are 
inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are 
written left to right in 5' to 3' orientation; amino acid sequences are written left to right in 
amino to carboxy orientation, respectively. It is to be understood that this invention is not 
limited to the particular methodology, protocols, and reagents described, as these may vary, 
depending upon the context in which they are used by those of skill in the art.
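
As a purely illustrative aside on the 5' to 3' writing convention noted above, the complementary strand of a DNA sequence, when it too is written left to right in 5' to 3' orientation, is the reverse complement of the original. The following Python sketch is offered only under stated assumptions: the function name `reverse_complement` is hypothetical, the example sequence is arbitrary, and only the unambiguous bases A, C, G and T are handled.

```python
# Translation table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    Assumption: the input contains only A, C, G, T (either case).
    """
    allowed = set("ACGTacgt")
    if not set(seq_5_to_3) <= allowed:
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return seq_5_to_3.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```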
The practice of the present invention employs, unless otherwise indicated,
conventional techniques of protein purification, molecular biology, microbiology, recombinant 
DNA techniques and protein sequencing, all of which are within the skill of those in the art. 

Furthermore, the headings provided herein are not limitations of the various aspects 
or embodiments of the invention which can be had by reference to the specification as a 
whole. Accordingly, the terms defined immediately below are more fully defined by
reference to the specification as a whole. Nonetheless, in order to facilitate understanding 
of the invention, a number of terms are defined below. 

I. Definitions 

As used herein, the terms "protease," and "proteolytic activity" refer to a protein or 
peptide exhibiting the ability to hydrolyze peptides or substrates having peptide linkages. 
Many well known procedures exist for measuring proteolytic activity (Kalisz, "Microbial 
Proteinases," In: Fiechter (ed.), Advances in Biochemical Engineering/Biotechnology,
[1988]). For example, proteolytic activity may be ascertained by comparative assays which 
analyze the respective protease's ability to hydrolyze a commercial substrate. Exemplary 
substrates useful in such analysis of protease or proteolytic activity include, but are not
limited to, di-methyl casein (Sigma C-9801), bovine collagen (Sigma C-9879), bovine elastin
(Sigma E-1625), and bovine keratin (ICN Biomedical 902111). Colorimetric assays utilizing
these substrates are well known in the art (See e.g., WO 99/34011; and U.S. Pat. No.
6,376,450, both of which are incorporated herein by reference). The pNA assay (See e.g.,
Del Mar et al., Anal. Biochem., 99:316-320 [1979]) also finds use in determining the active
enzyme concentration for fractions collected during gradient elution. This assay measures
the rate at which p-nitroaniline is released as the enzyme hydrolyzes the soluble synthetic
substrate, succinyl-alanine-alanine-proline-phenylalanine-p-nitroanilide (sAAPF-pNA). The
rate of production of yellow color from the hydrolysis reaction is measured at 410 nm on a 
spectrophotometer and is proportional to the active enzyme concentration. In addition, 
absorbance measurements at 280 nm can be used to determine the total protein 
concentration. The active enzyme/total-protein ratio gives the enzyme purity. 
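
The arithmetic of the pNA assay described above can be sketched as follows, by way of example only. The sketch fits a straight line to absorbance readings at 410 nm to obtain the rate of color production, and then forms the active enzyme/total protein ratio from that rate and an absorbance reading at 280 nm. The function names, the two calibration factors (`rate_per_active_unit` and `a280_per_protein_unit`) and the sample readings are hypothetical and are not taken from the present Specification; in practice the factors would come from calibration against standards.

```python
from typing import Sequence


def linear_rate(times_min: Sequence[float], a410: Sequence[float]) -> float:
    """Least-squares slope of A410 versus time (absorbance units per minute)."""
    n = len(times_min)
    if n != len(a410) or n < 2:
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_min) / n
    mean_a = sum(a410) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a410))
    return sxy / sxx


def purity_ratio(
    rate_410: float,
    a280: float,
    rate_per_active_unit: float = 1.0,   # hypothetical calibration factor
    a280_per_protein_unit: float = 1.0,  # hypothetical calibration factor
) -> float:
    """Active enzyme / total protein ratio from the two measurements."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    active = rate_410 / rate_per_active_unit   # proportional to active enzyme
    total = a280 / a280_per_protein_unit       # proportional to total protein
    return active / total


if __name__ == "__main__":
    # Made-up readings for one fraction (illustration only).
    times = [0.0, 1.0, 2.0, 3.0]
    readings = [0.1, 0.3, 0.5, 0.7]
    rate = linear_rate(times, readings)
    print(rate, purity_ratio(rate, a280=0.4))
```

With both calibration factors left at 1.0 the ratio is only a relative figure, useful for comparing fractions from the same gradient elution against one another.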

As used herein, the terms "ASP protease," "Asp protease," and "Asp," refer to the 
serine proteases described herein. In some preferred embodiments, the Asp protease is 
the protease designated herein as 69B4 protease obtained from Cellulomonas strain 69B4.
Thus, in preferred embodiments, the term "69B4 protease" refers to a naturally occurring 
mature protease derived from Cellulomonas strain 69B4 (DSM 16035) having substantially 
identical amino acid sequences as provided in SEQ ID NO:8. In alternative embodiments, 
the present invention provides portions of the ASP protease.

The term "Cellulomonas protease homologues" refers to naturally occurring
proteases having substantially identical amino acid sequences to the mature protease 
derived from Cellulomonas strain 69B4 or polynucleotide sequences which encode for such 
naturally occurring proteases, and which proteases retain the functional characteristics of a 
serine protease encoded by such nucleic acids. In some embodiments, these protease 
homologues are referred to as "cellulomonadins." 

As used herein, the terms "protease variant," "ASP variant," "ASP protease variant,"
and "69B4 protease variant" are used in reference to proteases that are similar to the wild-
type ASP, particularly in their function, but have mutations in their amino acid sequence that 
make them different in sequence from the wild-type protease. 

As used herein, "Cellulomonas spp." refers to all of the species within the genus
"Cellulomonas" which are Gram-positive bacteria classified as members of the Family
Cellulomonadaceae, Suborder Micrococcineae, Order Actinomycetales, Class
Actinobacteria. It is recognized that the genus Cellulomonas continues to undergo
taxonomical reorganization. Thus, it is intended that the genus include species that have
been reclassified.

As used herein, "Streptomyces spp." refers to all of the species within the genus
"Streptomyces" which are Gram-positive bacteria classified as members of the Family
Streptomycetaceae, Suborder Streptomycineae, Order Actinomycetales, Class
Actinobacteria. It is recognized that the genus Streptomyces continues to undergo
taxonomical reorganization. Thus, it is intended that the genus include species that have
been reclassified.

As used herein, "the genus Bacillus" includes all species within the genus "Bacillus"
as known to those of skill in the art, including but not limited to B. subtilis, B. licheniformis, B. 
lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. clausii, B. 
halodurans, B. megaterium, B. coagulans, B. circulans, B. lautus, and B. thuringiensis. It is 
recognized that the genus Bacillus continues to undergo taxonomical reorganization. Thus, 
it is intended that the genus include species that have been reclassified, including but not 
limited to such organisms as B. stearothermophilus, which is now named "Geobacillus
stearothermophilus." The production of resistant endospores in the presence of oxygen is
considered the defining feature of the genus Bacillus, although this characteristic also 
applies to the recently named Alicyclobacillus, Amphibacillus, Aneurinibacillus, 
Anoxybacillus, Brevibacillus, Filobacillus, Gracilibacillus, Halobacillus, Paenibacillus, 
Salibacillus, Thermobacillus, Ureibacillus, and Virgibacillus. 
The terms "polynucleotide" and "nucleic acid", used interchangeably herein, refer to
a polymeric form of nucleotides of any length, either ribonucleotides or 
deoxyribonucleotides. These terms include, but are not limited to, a single-, double- or 
triple-stranded DNA, genomic DNA, cDNA, RNA, DNA-RNA hybrid, or a polymer comprising 
purine and pyrimidine bases, or other natural, chemically, biochemically modified, non- 
natural or derivatized nucleotide bases. The following are non-limiting examples of 
polynucleotides: genes, gene fragments, chromosomal fragments, ESTs, exons, introns, 
mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched 
polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any 
sequence, nucleic acid probes, and primers. In some embodiments, polynucleotides 
comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs, 
uracil, other sugars and linking groups such as fluororibose and thioate, and nucleotide 
branches. In alternative embodiments, the sequence of nucleotides is interrupted by non-
nucleotide components. 

As used herein, the terms "DNA construct" and "transforming DNA" are used 
interchangeably to refer to DNA used to introduce sequences into a host cell or organism. 
The DNA may be generated in vitro by PCR or any other suitable technique(s) known to
those in the art. In particularly preferred embodiments, the DNA construct comprises a 
sequence of interest (e.g., as an incoming sequence). In some embodiments, the sequence 
is operably linked to additional elements such as control elements (e.g., promoters, etc.). 
The DNA construct may further comprise a selectable marker. It may further comprise an 
incoming sequence flanked by homology boxes. In a further embodiment, the transforming 
DNA comprises other non-homologous sequences, added to the ends (e.g., stuffer 
sequences or flanks). In some embodiments, the ends of the incoming sequence are 
closed such that the transforming DNA forms a closed circle. The transforming sequences 
may be wild-type, mutant or modified. In some embodiments, the DNA construct comprises 
sequences homologous to the host cell chromosome. In other embodiments, the DNA 
construct comprises non-homologous sequences. Once the DNA construct is assembled in 
vitro it may be used to: 1) insert heterologous sequences into a desired target sequence of
a host cell; 2) mutagenize a region of the host cell chromosome (i.e., replace an
endogenous sequence with a heterologous sequence); 3) delete target genes; and/or
4) introduce a replicating plasmid into the host.

As used herein, the terms "expression cassette" and "expression vector" refer to 
nucleic acid constructs generated recombinantly or synthetically, with a series of specified 
nucleic acid elements that permit transcription of a particular nucleic acid in a target cell. 



WO 2005/052146 



PCT/US2004/039066 




The recombinant expression cassette can be incorporated into a plasmid, chromosome, 
mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant 
expression cassette portion of an expression vector includes, among other sequences, a 
nucleic acid sequence to be transcribed and a promoter. In preferred embodiments, 
expression vectors have the ability to incorporate and express heterologous DNA fragments 
in a host cell. Many prokaryotic and eukaryotic expression vectors are commercially 
available. Selection of appropriate expression vectors is within the knowledge of those of 
skill in the art. The term "expression cassette" is used interchangeably herein with "DNA 
construct," and their grammatical equivalents.

As used herein, the term "vector" refers to a polynucleotide construct designed to
introduce nucleic acids into one or more cell types. Vectors include cloning vectors,
expression vectors, shuttle vectors, plasmids, cassettes and the like. In some
embodiments, the polynucleotide construct comprises a DNA sequence encoding the
protease (e.g., precursor or mature protease) that is operably linked to a suitable
prosequence (e.g., secretory, etc.) capable of effecting the expression of the DNA in a
suitable host. 

As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA 
construct used as a cloning vector, and which forms an extrachromosomal self-replicating 
genetic element in some eukaryotes or prokaryotes, or integrates into the host 
chromosome. 

As used herein in the context of introducing a nucleic acid sequence into a cell, the 
term "introduced" refers to any method suitable for transferring the nucleic acid sequence 
into the cell. Such methods for introduction include but are not limited to protoplast fusion,
transfection, transformation, conjugation, and transduction (See e.g., Ferrari et al.,
"Genetics," in Hardwood et al. (eds.), Bacillus, Plenum Publishing Corp., pages 57-72,
[1989]). 

As used herein, the terms "transformed" and "stably transformed" refer to a cell that
has a non-native (heterologous) polynucleotide sequence integrated into its genome or as 
an episomal plasmid that is maintained for at least two generations. 

As used herein, the term "selectable marker-encoding nucleotide sequence" refers to 
a nucleotide sequence which is capable of expression in the host cells and where 
expression of the selectable marker confers to cells containing the expressed gene the 
ability to grow in the presence of a corresponding selective agent or lack of an essential 
nutrient. 
As used herein, the terms "selectable marker" and "selective marker" refer to a
nucleic acid (e.g., a gene) capable of expression in a host cell which allows for ease of
selection of those hosts containing the vector. Examples of such selectable markers include
but are not limited to antimicrobials. Thus, the term "selectable marker" refers to genes that
provide an indication that a host cell has taken up an incoming DNA of interest or some 
other reaction has occurred. Typically, selectable markers are genes that confer 
antimicrobial resistance or a metabolic advantage on the host cell to allow cells containing 
the exogenous DNA to be distinguished from cells that have not received any exogenous 
sequence during the transformation. A "residing selectable marker" is one that is located on 
the chromosome of the microorganism to be transformed. A residing selectable marker 
encodes a gene that is different from the selectable marker on the transforming DNA 
construct. Selective markers are well known to those of skill in the art. As indicated above, 
preferably the marker is an antimicrobial resistance marker (e.g., ampR; phleoR; specR; kanR;
eryR; tetR; cmpR; and neoR; See e.g., Guerot-Fleury, Gene, 167:335-337 [1995]; Palmeros
et al., Gene 247:255-264 [2000]; and Trieu-Cuot et al., Gene, 23:331-341 [1983]). Other
markers useful in accordance with the invention include, but are not limited to, auxotrophic
markers, such as tryptophan; and detection markers, such as β-galactosidase.

As used herein, the term "promoter" refers to a nucleic acid sequence that functions 
to direct transcription of a downstream gene. In preferred embodiments, the promoter is 
appropriate to the host cell in which the target gene is being expressed. The promoter, 
together with other transcriptional and translational regulatory nucleic acid sequences (also 
termed "control sequences") is necessary to express a given gene. In general, the 
transcriptional and translational regulatory sequences include, but are not limited to, 
promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, 
translational start and stop sequences, and enhancer or activator sequences. 

A nucleic acid is "operably linked" when it is placed into a functional relationship with 
another nucleic acid sequence. For example, DNA encoding a secretory leader (i.e., a 
signal peptide), is operably linked to DNA for a polypeptide if it is expressed as a preprotein 
that participates in the secretion of the polypeptide; a promoter or enhancer is operably 
linked to a coding sequence if it affects the transcription of the sequence; or a ribosome 
binding site is operably linked to a coding sequence if it is positioned so as to facilitate 
translation. Generally, "operably linked" means that the DNA sequences being linked are 
contiguous, and, in the case of a secretory leader, contiguous and in reading phase. 
However, enhancers do not have to be contiguous. Linking is accomplished by ligation at 
convenient restriction sites. If such sites do not exist, the synthetic oligonucleotide adaptors 
or linkers are used in accordance with conventional practice. 

As used herein the term "gene" refers to a polynucleotide (e.g., a DNA segment), 
that encodes a polypeptide and includes regions preceding and following the coding regions 
as well as intervening sequences (introns) between individual coding segments (exons). 

As used herein, "homologous genes" refers to a pair of genes from different, but 
usually related species, which correspond to each other and which are identical or very 
similar to each other. The term encompasses genes that are separated by speciation (i.e., 
the development of new species) (e.g., orthologous genes), as well as genes that have been 
separated by genetic duplication (e.g., paralogous genes). 

As used herein, "ortholog" and "orthologous genes" refer to genes in different 
species that have evolved from a common ancestral gene (i.e., a homologous gene) by 
speciation. Typically, orthologs retain the same function during the course of evolution. 
Identification of orthologs finds use in the reliable prediction of gene function in newly 
sequenced genomes. 

As used herein, "paralog" and "paralogous genes" refer to genes that are related by 
duplication within a genome. While orthologs retain the same function through the course of 
evolution, paralogs evolve new functions, even though some functions are often related to 
the original one. Examples of paralogous genes include, but are not limited to genes 
encoding trypsin, chymotrypsin, elastase, and thrombin, which are all serine proteinases and 
occur together within the same species. 

As used herein, "homology" refers to sequence similarity or identity, with identity 
being preferred. This homology is determined using standard techniques known in the art 
(See e.g., Smith and Waterman, Adv. Appl. Math., 2:482 [1981]; Needleman and Wunsch, 
J. Mol. Biol., 48:443 [1970]; Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 
[1988]; programs such as GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics 
Software Package (Genetics Computer Group, Madison, WI); and Devereux et al., Nucl. 
Acid Res., 12:387-395 [1984]). 

As used herein, an "analogous sequence" is one wherein the function of the gene is 
essentially the same as the gene based on the Cellulomonas strain 69B4 protease. 
Additionally, analogous genes include at least 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 
85%, 90%, 95%, 97%, 98%, 99% or 100% sequence identity with the sequence of the 
Cellulomonas strain 69B4 protease. Alternately, analogous sequences have an alignment 
of between 70 to 100% of the genes found in the Cellulomonas strain 69B4 protease region 
and/or have at least between 5-10 genes found in the region aligned with the genes in the 
Cellulomonas strain 69B4 chromosome. In additional embodiments more than one of the 
above properties applies to the sequence. Analogous sequences are determined by known 
methods of sequence alignment. A commonly used alignment method is BLAST, although 
as indicated above and below, there are other methods that also find use in aligning 
sequences. 

One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence 
alignment from a group of related sequences using progressive, pair-wise alignments. It 
can also plot a tree showing the clustering relationships used to create the alignment. 
PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle 
(Feng and Doolittle, J. Mol. Evol., 35:351-360 [1987]). The method is similar to that 
described by Higgins and Sharp (Higgins and Sharp, CABIOS 5:151-153 [1989]). Useful 
PILEUP parameters include a default gap weight of 3.00, a default gap length weight of 
0.10, and weighted end gaps. 

Another example of a useful algorithm is the BLAST algorithm, described by Altschul 
et al. (Altschul et al., J. Mol. Biol., 215:403-410 [1990]; and Karlin et al., Proc. Natl. Acad. 
Sci. USA 90:5873-5787 [1993]). A particularly useful BLAST program is the WU-BLAST-2 
program (See, Altschul et al., Meth. Enzymol., 266:460-480 [1996]). WU-BLAST-2 uses 
several search parameters, most of which are set to the default values. The adjustable 
parameters are set with the following values: overlap span = 1, overlap fraction = 0.125, 
word threshold (T) = 11. The HSP S and HSP S2 parameters are dynamic values and are 
established by the program itself depending upon the composition of the particular 
sequence and composition of the particular database against which the sequence of interest 
is being searched. However, the values may be adjusted to increase sensitivity. A % amino 
acid sequence identity value is determined by the number of matching identical residues 
divided by the total number of residues of the "longer" sequence in the aligned region. The 
"longer" sequence is the one having the most actual residues in the aligned region (gaps 
introduced by WU-BLAST-2 to maximize the alignment score are ignored). 
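
As a minimal illustration of the percent identity calculation described above, the following Python sketch counts identical residues in an already-computed pairwise alignment and divides by the residue count of the "longer" sequence, with gaps ignored. The function name `percent_identity`, the `-` gap character, and the example sequences are hypothetical and are not taken from WU-BLAST-2 itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an aligned region.

    Assumption: both strings are rows of one pairwise alignment, so they
    have equal length and use `gap` wherever a gap was introduced.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Identical residues: same character in both rows, and not a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)

    # The "longer" sequence is the one with the most actual residues;
    # gap characters are ignored when counting.
    residues_a = sum(1 for x in aligned_a if x != gap)
    residues_b = sum(1 for y in aligned_b if y != gap)
    longer = max(residues_a, residues_b)
    if longer == 0:
        raise ValueError("aligned region contains no residues")

    return 100.0 * matches / longer


if __name__ == "__main__":
    # Hypothetical aligned fragments: 6 identical residues, 7 residues each.
    print(round(percent_identity("ACDEFG-H", "ACD-FGKH"), 1))
```

The alignment itself is assumed to come from a separate tool; this sketch only shows the final ratio.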

Thus, "percent (%) nucleic acid sequence identity" is defined as the percentage of 
nucleotide residues in a candidate sequence that are identical with the nucleotide residues 
of the starting sequence (i.e., the sequence of interest). A preferred method utilizes the 
BLASTN module of WU-BLAST-2 set to the default parameters, with overlap span and 
overlap fraction set to 1 and 0.125, respectively. 

As used herein, the term "hybridization" refers to the process by which a strand of 
nucleic acid joins with a complementary strand through base pairing, as known in the art. 

A nucleic acid sequence is considered to be "selectively hybridizable" to a reference 
nucleic acid sequence if the two sequences specifically hybridize to one another under 
moderate to high stringency hybridization and wash conditions. Hybridization conditions are 
based on the melting temperature (Tm) of the nucleic acid binding complex or probe. For 
example, "maximum stringency" typically occurs at about Tm-5°C (5° below the Tm of the 
probe); "high stringency" at about 5-10°C below the Tm; "intermediate stringency" at about 
10-20°C below the Tm of the probe; and "low stringency" at about 20-25°C below the Tm. 
Functionally, maximum stringency conditions may be used to identify sequences having 
strict identity or near-strict identity with the hybridization probe; while an intermediate or low 
stringency hybridization can be used to identify or detect polynucleotide sequence 
homologs. 
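
The temperature ranges above can be sketched as a simple lookup. The following Python example is illustrative only: the function name `classify_stringency` is hypothetical, and the handling of the range boundaries (for instance, treating anything within 5°C below the Tm as "maximum") is an assumption, since the ranges above are stated only approximately.

```python
from typing import Optional


def classify_stringency(tm_c: float, temp_c: float) -> Optional[str]:
    """Map a hybridization temperature to a stringency label.

    tm_c   -- melting temperature (Tm) of the probe, in degrees C
    temp_c -- hybridization or wash temperature, in degrees C
    Returns None when the temperature is above the Tm or more than
    25 degrees below it, i.e., outside the ranges named in the text.
    """
    delta = tm_c - temp_c  # degrees below the Tm
    if delta < 0:
        return None
    if delta <= 5:        # assumption: "about Tm-5" or closer to the Tm
        return "maximum"
    if delta <= 10:       # about 5-10 degrees below the Tm
        return "high"
    if delta <= 20:       # about 10-20 degrees below the Tm
        return "intermediate"
    if delta <= 25:       # about 20-25 degrees below the Tm
        return "low"
    return None


if __name__ == "__main__":
    for temp in (65, 62, 55, 47):
        print(temp, classify_stringency(70.0, temp))
```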

Moderate and high stringency hybridization conditions are well known in the art. An 
example of high stringency conditions includes hybridization at about 42°C in 50% 
formamide, 5X SSC, 5X Denhardt's solution, 0.5% SDS and 100 μg/ml denatured carrier 
DNA, followed by washing two times in 2X SSC and 0.5% SDS at room temperature and two 
additional times in 0.1X SSC and 0.5% SDS at 42°C. An example of moderately stringent 
conditions includes an overnight incubation at 37°C in a solution comprising 20% formamide, 
5X SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X 
Denhardt's solution, 10% dextran sulfate and 20 mg/ml denatured sheared salmon sperm 
DNA, followed by washing the filters in 1X SSC at about 37-50°C. Those of skill in the art 
know how to adjust the temperature, ionic strength, etc. as necessary to accommodate 
factors such as probe length and the like. 

As used herein, "recombinant" includes reference to a cell or vector, that has been 
modified by the introduction of a heterologous nucleic acid sequence or that the cell is 
derived from a cell so modified. Thus, for example, recombinant cells express genes that 
are not found in identical form within the native (non-recombinant) form of the cell or 
express native genes that are otherwise abnormally expressed, under expressed or not 
expressed at all as a result of deliberate human intervention. "Recombination," 
"recombining," and generating a "recombined" nucleic acid are generally the assembly of 
two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene. 

In a preferred embodiment, mutant DNA sequences are generated with site 
saturation mutagenesis in at least one codon. In another preferred embodiment, site 
saturation mutagenesis is performed for two or more codons. In a further embodiment, 
mutant DNA sequences have more than 50%, more than 55%, more than 60%, more than 
65%, more than 70%, more than 75%, more than 80%, more than 85%, more than 90%, 
more than 95%, or more than 98% homology with the wild-type sequence. In alternative 
embodiments, mutant DNA is generated in vivo using any known mutagenic procedure such 
as, for example, radiation, nitrosoguanidine and the like. The desired DNA sequence is then 
isolated and used in the methods provided herein. 

As used herein, the term "target sequence" refers to a DNA sequence in the host cell 
that encodes the sequence where it is desired for the incoming sequence to be inserted into 
the host cell genome. In some embodiments, the target sequence encodes a functional 
wild-type gene or operon, while in other embodiments the target sequence encodes a 
functional mutant gene or operon, or a non-functional gene or operon. 

As used herein, a "flanking sequence" refers to any sequence that is either upstream 
or downstream of the sequence being discussed (e.g., for genes A-B-C, gene B is flanked 
by the A and C gene sequences). In a preferred embodiment, the incoming sequence is 
flanked by a homology box on each side. In another embodiment, the incoming sequence 
and the homology boxes comprise a unit that is flanked by stuffer sequence on each side. 
In some embodiments, a flanking sequence is present on only a single side (either 3' or 5'), 
while in preferred embodiments, it is present on each side of the sequence being flanked. 

As used herein, the term "stuffer sequence" refers to any extra DNA that flanks 
homology boxes (typically vector sequences). However, the term encompasses any non- 
homologous DNA sequence. Not to be limited by any theory, a stuffer sequence provides a 
noncritical target for a cell to initiate DNA uptake. 

As used herein, the terms "amplification" and "gene amplification" refer to a process 
by which specific DNA sequences are disproportionately replicated such that the amplified 
gene becomes present in a higher copy number than was initially present in the genome. In 
some embodiments, selection of cells by growth in the presence of a drug (e.g., an inhibitor 
of an inhibitable enzyme) results in the amplification of either the endogenous gene 
encoding the gene product required for growth in the presence of the drug or 
exogenous (i.e., input) sequences encoding this gene product, or both. 

"Amplification" is a special case of nucleic acid replication involving template 
specificity. It is to be contrasted with non-specific template replication (i.e., replication that is 
template-dependent but not dependent on a specific template). Template specificity is here 
distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide 
sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently 
described in terms of "target" specificity. Target sequences are "targets" in the sense that 
they are sought to be sorted out from other nucleic acid. Amplification techniques have 
been designed primarily for this sorting out. 

As used herein, the term "co-amplification" refers to the introduction into a single cell 
of an amplifiable marker in conjunction with other gene sequences (i.e., comprising one or 
more non-selectable genes such as those contained within an expression vector) and the 
application of appropriate selective pressure such that the cell amplifies both the amplifiable 
marker and the other, non-selectable gene sequences. The amplifiable marker may be 
physically linked to the other gene sequences or alternatively two separate pieces of DNA, 
one containing the amplifiable marker and the other containing the non-selectable marker, 
may be introduced into the same cell. 

As used herein, the terms "amplifiable marker," "amplifiable gene," and "amplification 
vector" refer to a gene or a vector encoding a gene which permits the amplification of that 
gene under appropriate growth conditions. 

"Template specificity" is achieved in most amplification techniques by the choice of 
enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will 
process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. 
For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the 
replicase (See e.g., Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic 
acids are not replicated by this amplification enzyme. Similarly, in the case of T7 RNA 
polymerase, this amplification enzyme has a stringent specificity for its own promoters (See, 
Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will 
not ligate two oligonucleotides or polynucleotides where there is a mismatch between 
the oligonucleotide or polynucleotide substrate and the template at the ligation junction 
(See, Wu and Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by 
virtue of their ability to function at high temperature, are found to display high specificity for 
the sequences bounded and thus defined by the primers; the high temperature results in 
thermodynamic conditions that favor primer hybridization with the target sequences and not 
hybridization with non-target sequences. 

As used herein, the term "amplifiable nucleic acid" refers to nucleic acids which may 
be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" 
will usually comprise "sample template." 

As used herein, the term "sample template" refers to nucleic acid originating from a 
sample which is analyzed for the presence of "target" (defined below). In contrast, 
"background template" is used in reference to nucleic acid other than sample template 
which may or may not be present in a sample. Background template is most often 
inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic 
acid contaminants sought to be purified away from the sample. For example, nucleic acids 
from organisms other than those to be detected may be present as background in a test 
sample. 

As used herein, the term "primer" refers to an oligonucleotide, whether occurring 
naturally as in a purified restriction digest or produced synthetically, which is capable of 
acting as a point of initiation of synthesis when placed under conditions in which synthesis of 
a primer extension product which is complementary to a nucleic acid strand is induced, (i.e., 
in the presence of nucleotides and an inducing agent such as DNA polymerase and at a 
suitable temperature and pH). The primer is preferably single stranded for maximum 
efficiency in amplification, but may alternatively be double stranded. If double stranded, the 
primer is first treated to separate its strands before being used to prepare extension 
products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be 
sufficiently long to prime the synthesis of extension products in the presence of the inducing 
agent. The exact lengths of the primers will depend on many factors, including temperature, 
source of primer and the use of the method. 

As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of 
nucleotides), whether occurring naturally as in a purified restriction digest or produced 
synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to 
another oligonucleotide of interest. A probe may be single-stranded or double-stranded. 
Probes are useful in the detection, identification and isolation of particular gene sequences. 
It is contemplated that any probe used in the present invention will be labeled with any 
"reporter molecule," so that it is detectable in any detection system, including, but not limited 
to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, 
radioactive, and luminescent systems. It is not intended that the present invention be limited 
to any particular detection system or label. 

As used herein, the term "target," when used in reference to the polymerase chain 
reaction, refers to the region of nucleic acid bounded by the primers used for polymerase 
chain reaction. Thus, the "target" is sought to be sorted out from other nucleic acid 
sequences. A "segment" is defined as a region of nucleic acid within the target sequence. 

As used herein, the term "polymerase chain reaction" ("PCR") refers to the methods 
of U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference, 
which include methods for increasing the concentration of a segment of a target sequence 
in a mixture of genomic DNA without cloning or purification. This process for amplifying the 
target sequence consists of introducing a large excess of two oligonucleotide primers to the 
DNA mixture containing the desired target sequence, followed by a precise sequence of 
thermal cycling in the presence of a DNA polymerase. The two primers are complementary 
to their respective strands of the double stranded target sequence. To effect amplification, 
the mixture is denatured and the primers then annealed to their complementary sequences 
within the target molecule. Following annealing, the primers are extended with a 
polymerase so as to form a new pair of complementary strands. The steps of denaturation, 
primer annealing and polymerase extension can be repeated many times (i.e., denaturation, 
annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a 
high concentration of an amplified segment of the desired target sequence. The length of 
the amplified segment of the desired target sequence is determined by the relative positions 
of the primers with respect to each other, and therefore, this length is a controllable 
parameter. By virtue of the repeating aspect of the process, the method is referred to as the 
"polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments 
of the target sequence become the predominant sequences (in terms of concentration) in 
the mixture, they are said to be "PCR amplified". 
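
The effect of repeated cycling can be illustrated with a short calculation. The Python sketch below uses an idealized textbook model, which is an assumption and not part of the definition above: each cycle multiplies the number of target copies by (1 + efficiency), where an efficiency of 1.0 corresponds to perfect doubling. The function names, the coordinate convention, and the example numbers are hypothetical.

```python
def ideal_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Assumption: every cycle multiplies the target by (1 + efficiency),
    with efficiency between 0.0 (no amplification) and 1.0 (doubling).
    Real reactions plateau, which this model ignores.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    return initial_copies * (1.0 + efficiency) ** cycles


def amplicon_length(forward_5p: int, reverse_5p: int) -> int:
    """Length of the segment bounded by the two primers.

    Both arguments are 1-based positions on the same strand of the
    template: the 5' end of the forward primer, and the template position
    paired with the 5' end of the reverse primer.
    """
    if reverse_5p < forward_5p:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_5p - forward_5p + 1


if __name__ == "__main__":
    print(ideal_copies(1, 30))        # single starting copy, 30 ideal cycles
    print(amplicon_length(101, 600))  # primers bounding positions 101..600
```

The second function reflects the statement above that the length of the amplified segment is set by the relative positions of the primers and is therefore a controllable parameter.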

As used herein, the term "amplification reagents" refers to those reagents 
(deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for 
primers, nucleic acid template and the amplification enzyme. Typically, amplification 
reagents along with other reaction components are placed and contained in a reaction 
vessel (test tube, microwell, etc.). 

With PCR, it is possible to amplify a single copy of a specific target sequence in 
genomic DNA to a level detectable by several different methodologies (e.g., hybridization 
with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme 
conjugate detection; incorporation of ³²P-labeled deoxynucleotide triphosphates, such as 
dCTP or dATP, into the amplified segment). In addition to genomic DNA, any 
oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of 
primer molecules. In particular, the amplified segments created by the PCR process itself 
are, themselves, efficient templates for subsequent PCR amplifications. 

As used herein, the terms "PCR product," "PCR fragment," and "amplification 
product" refer to the resultant mixture of compounds after two or more cycles of the PCR 
steps of denaturation, annealing and extension are complete. These terms encompass the 
case where there has been amplification of one or more segments of one or more target 
sequences. 

As used herein, the term "RT-PCR" refers to the replication and amplification of RNA 
sequences. In this method, reverse transcription is coupled to PCR, most often using a one 
enzyme procedure in which a thermostable polymerase is employed, as described in U.S. 
Patent No. 5,322,770, herein incorporated by reference. In RT-PCR, the RNA template is 
converted to cDNA due to the reverse transcriptase activity of the polymerase, and then 
amplified using the polymerizing activity of the polymerase (i.e., as in other PCR methods). 

As used herein, the terms "restriction endonucleases" and "restriction enzymes" 
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific 
nucleotide sequence. 

A "restriction site" refers to a nucleotide sequence recognized and cleaved by a 
given restriction endonuclease and is frequently the site for insertion of DNA fragments. In 
certain embodiments of the invention restriction sites are engineered into the selective 
marker and into 5' and 3' ends of the DNA construct. 
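
To illustrate how a restriction site defines where a DNA molecule is cut, the following Python sketch locates every occurrence of a recognition sequence in a linear DNA string and splits the string at the cut position. The choice of EcoRI (recognition sequence GAATTC, cut after the first G) is an example only and is not specified above; the function names and the test sequence are hypothetical, and only one strand is modeled, so the single-stranded overhangs left by the enzyme are not represented.

```python
from typing import List


def find_sites(dna: str, site: str) -> List[int]:
    """Return the 0-based start index of every occurrence of `site`."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also found.
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str, cut_offset: int) -> List[str]:
    """Split a linear, single-strand representation at each site.

    cut_offset is the number of bases from the start of the site to the
    cut position (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    previous = 0
    for pos in find_sites(dna, site):
        cut = pos + cut_offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


if __name__ == "__main__":
    sequence = "AAGAATTCGGGAATTCTT"  # hypothetical sequence with two sites
    print(find_sites(sequence, "GAATTC"))
    print(digest(sequence, "GAATTC", cut_offset=1))
```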

As used herein, the term "chromosomal integration" refers to the process whereby 
an incoming sequence is introduced into the chromosome of a host cell. The homologous 
regions of the transforming DNA align with homologous regions of the chromosome. 
Subsequently, the sequence between the homology boxes is replaced by the incoming 
sequence in a double crossover (i.e., homologous recombination). In some embodiments 
of the present invention, homologous sections of an inactivating chromosomal segment of a 
DNA construct align with the flanking homologous regions of the indigenous chromosomal 
region of the Bacillus chromosome. Subsequently, the indigenous chromosomal region is 
deleted by the DNA construct in a double crossover (i.e., homologous recombination). 

"Homologous recombination" means the exchange of DNA fragments between two 
DNA molecules or paired chromosomes at the site of identical or nearly identical nucleotide 
sequences. In a preferred embodiment, chromosomal integration is homologous 
recombination. 

"Homologous sequences" as used herein means a nucleic acid or polypeptide 
sequence having 100%, 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 88%, 
85%, 80%, 75%, or 70% sequence identity to another nucleic acid or polypeptide sequence 
when optimally aligned for comparison. In some embodiments, homologous sequences 
have between 85% and 100% sequence identity, while in other embodiments there is 
between 90% and 100% sequence identity, and in more preferred embodiments, there is 
between 95% and 100% sequence identity. 

As used herein "amino acid" refers to peptide or protein sequences or portions 
thereof. The terms "protein," "peptide," and "polypeptide" are used interchangeably. 

As used herein, "protein of interest" and "polypeptide of interest" refer to a 
protein/polypeptide that is desired and/or being assessed. In some embodiments, the 
protein of interest is expressed intracellularly, while in other embodiments, it is a secreted 
polypeptide. In particularly preferred embodiments, these enzymes include the serine 
proteases of the present invention. In some embodiments, the protein of interest is a 
secreted polypeptide which is fused to a signal peptide (i.e., an amino-terminal extension on 
a protein to be secreted). Nearly all secreted proteins use an amino-terminal protein 
extension which plays a crucial role in the targeting to and translocation of precursor 
proteins across the membrane. This extension is proteolytically removed by a signal 
peptidase during or immediately following membrane transfer. 

As used herein, the term "heterologous protein" refers to a protein or polypeptide 
that does not naturally occur in the host cell. Examples of heterologous proteins include 
enzymes such as hydrolases including proteases. In some embodiments, the genes 
encoding the proteins are naturally occurring genes, while in other embodiments, mutated 
and/or synthetic genes are used. 

As used herein, "homologous protein" refers to a protein or polypeptide native or 
naturally occurring in a cell. In preferred embodiments, the cell is a Gram-positive cell, while 
in particularly preferred embodiments, the cell is a Bacillus host cell. In alternative 
embodiments, the homologous protein is a native protein produced by other organisms, 
including but not limited to E. coli, Streptomyces, Trichoderma, and Aspergillus. The 
invention encompasses host cells producing the homologous protein via recombinant DNA 
technology. 

As used herein, an "operon region" comprises a group of contiguous genes that are 
transcribed as a single transcription unit from a common promoter, and are thereby subject 
to co-regulation. In some embodiments, the operon includes a regulator gene. In most 
preferred embodiments, operons that are highly expressed as measured by RNA levels, but 
have an unknown or unnecessary function are used. 

As used herein, an "antimicrobial region" is a region containing at least one gene that 
encodes an antimicrobial protein. 

A polynucleotide is said to "encode" an RNA or a polypeptide if, in its native state or 
when manipulated by methods known to those of skill in the art, it can be transcribed and/or 
translated to produce the RNA, the polypeptide or a fragment thereof. The anti-sense 
strand of such a nucleic acid is also said to encode the sequences. 

As is known in the art, a DNA can be transcribed by an RNA polymerase to produce 
RNA, but an RNA can be reverse transcribed by reverse transcriptase to produce a DNA. 
Thus a DNA can encode a RNA and vice versa. 

The term "regulatory segment" or "regulatory sequence" or "expression control 
sequence" refers to a polynucleotide sequence of DNA that is operatively linked with a 
polynucleotide sequence of DNA that encodes the amino acid sequence of a polypeptide 
chain to effect the expression of the encoded amino acid sequence. The regulatory 
sequence can inhibit, repress, or promote the expression of the operably linked 
polynucleotide sequence encoding the amino acid. 

"Host strain" or "host cell" refers to a suitable host for an expression vector 
comprising DNA according to the present invention. 

An enzyme is "overexpressed" in a host cell if the enzyme is expressed in the cell at 
a higher level than the level at which it is expressed in a corresponding wild-type cell. 

The terms "protein" and "polypeptide" are used interchangeably herein. The 3-letter 
code for amino acids as defined in conformity with the IUPAC-IUB Joint Commission on 
Biochemical Nomenclature (JCBN) is used throughout this disclosure. It is also understood 
that a polypeptide may be coded for by more than one nucleotide sequence due to the 
degeneracy of the genetic code. 
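
The degeneracy noted above can be shown with a brief example. The Python sketch below translates two different nucleotide sequences into the same tripeptide, written in the 3-letter code. The codon assignments are from the standard genetic code, but only the few codons needed for the example are included; the function name `translate` and both nucleotide sequences are hypothetical.

```python
from typing import List

# Partial standard codon table (DNA sense-strand codons), limited to the
# codons used in this example. A full table has 64 entries.
CODONS = {
    "AGC": "Ser", "TCT": "Ser",
    "CAT": "His", "CAC": "His",
    "GAT": "Asp", "GAC": "Asp",
}


def translate(dna: str) -> List[str]:
    """Translate a coding sequence into 3-letter amino acid codes."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    residues = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon not in CODONS:
            raise ValueError(f"codon {codon} is not in the partial table")
        residues.append(CODONS[codon])
    return residues


if __name__ == "__main__":
    first = "AGCCATGAT"
    second = "TCTCACGAC"
    # The sequences differ at several positions yet encode the same peptide.
    print(translate(first))
    print(translate(second))
```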

A "prosequence" is an amino acid sequence between the signal sequence and 
mature protease that is necessary for the secretion of the protease. Cleavage of the pro 
sequence will result in a mature active protease. 

The term "signal sequence" or "signal peptide" refers to any sequence of nucleotides 
and/or amino acids which may participate in the secretion of the mature or precursor forms 
of the protein. This definition of signal sequence is a functional one, meant to include all 
those amino acid sequences encoded by the N-terminal portion of the protein gene, which 
participate in the effectuation of the secretion of protein. They are often, but not universally, 
bound to the N-terminal portion of a protein or to the N-terminal portion of a precursor 
protein. The signal sequence may be endogenous or exogenous. The signal sequence 
may be that normally associated with the protein (e.g., protease), or may be from a gene 
encoding another secreted protein. One exemplary exogenous signal sequence comprises 
the first seven amino acid residues of the signal sequence from Bacillus subtilis subtilisin 
fused to the remainder of the signal sequence of the subtilisin from Bacillus lentus (ATCC 
21536). 

The term "hybrid signal sequence" refers to signal sequences in which part of the 
sequence is obtained from the expression host fused to the signal sequence of the gene to 
be expressed. In some embodiments, synthetic sequences are utilized. 

The term "substantially the same signal activity" refers to the signal activity, as 
indicated by substantially the same secretion of the protease into the fermentation medium, 
for example a fermentation medium protease level being at least 50%, at least 60%, at least 
70%, at least 80%, at least 90%, at least 95%, at least 98% of the secreted protease levels 
in the fermentation medium as provided by the signal sequence of SEQ ID NOS:5 and/or 9. 

The term "mature" form of a protein or peptide refers to the final functional form of 
the protein or peptide. To exemplify, a mature form of the protease of the present invention 
at least includes the amino acid sequence identical to residue positions 1-189 of SEQ ID 
NO:8. 

The term "precursor" form of a protein or peptide refers to a mature form of the 
protein having a prosequence operably linked to the amino or carboxyl terminus of the 
protein. The precursor may also have a "signal" sequence operably linked to the amino 
terminus of the prosequence. The precursor may also have additional polynucleotides that 
are involved in post-translational activity (e.g., polynucleotides cleaved therefrom to leave 
the mature form of a protein or peptide). 

"Naturally occurring enzyme" refers to an enzyme having the unmodified amino acid 
sequence identical to that found in nature. Naturally occurring enzymes include native 
enzymes, those enzymes naturally expressed or found in the particular microorganism. 

The terms "derived from" and "obtained from" refer to not only a protease produced 
or producible by a strain of the organism in question, but also a protease encoded by a DNA 
sequence isolated from such strain and produced in a host organism containing such DNA 
sequence. Additionally, the term refers to a protease which is encoded by a DNA sequence 
of synthetic and/or cDNA origin and which has the identifying characteristics of the protease 
in question. To exemplify, "proteases derived from Cellulomonas" refers to those enzymes 
having proteolytic activity which are naturally-produced by Cellulomonas, as well as to serine 
proteases like those produced by Cellulomonas sources but which through the use of 
genetic engineering techniques are produced by non-Cellulomonas organisms transformed 
with a nucleic acid encoding said serine proteases. 

A "derivative" within the scope of this definition generally retains the characteristic 
proteolytic activity observed in the wild-type, native or parent form to the extent that the 
derivative is useful for similar purposes as the wild-type, native or parent form. Functional 
derivatives of serine protease encompass naturally occurring, synthetically or recombinantly 
produced peptides or peptide fragments which have the general characteristics of the serine 
protease of the present invention. 

The term "functional derivative" refers to a derivative of a nucleic acid which has the 
functional characteristics of a nucleic acid which encodes serine protease. Functional 
derivatives of a nucleic acid which encode serine protease of the present invention 
encompass naturally occurring, synthetically or recombinantly produced nucleic acids or 
fragments and encode serine protease characteristic of the present invention. Wild type 
nucleic acid encoding serine proteases according to the invention include naturally occurring 
alleles and homologues based on the degeneracy of the genetic code known in the art. 

The term "identical" in the context of two nucleic acids or polypeptide sequences 
refers to the residues in the two sequences that are the same when aligned for maximum 
correspondence, as measured using one of the following sequence comparison or analysis 
algorithms. 

The term "optimal alignment" refers to the alignment giving the highest percent 
identity score. 

"Percent sequence identity," "percent amino acid sequence identity," "percent gene 
sequence identity," and/or "percent nucleic acid/polynucleotide sequence identity," with 
respect to two amino acid, polynucleotide and/or gene sequences (as appropriate), refer to 
the percentage of residues that are identical in the two sequences when the sequences are 
optimally aligned. Thus, 80% amino acid sequence identity means that 80% of the amino 
acids in two optimally aligned polypeptide sequences are identical. 
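
By way of illustration only, the percent identity of two sequences that have already been aligned can be computed as in the following minimal sketch. The function name `percent_identity`, the gap symbol "-", and the choice to exclude gapped columns from the denominator are assumptions made for this example; alignment programs differ in how gaps are counted, and this sketch does not itself perform the optimal alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues between two already-aligned sequences.

    Assumption: columns in which either sequence has a gap are excluded
    from the denominator. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (illustrative convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

For example, two aligned ten-residue polypeptides that agree at eight positions give a value of 80.0, corresponding to the 80% amino acid sequence identity described above.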

The phrase "substantially identical" in the context of two nucleic acids or 
polypeptides thus refers to a polynucleotide or polypeptide comprising at least 70% 
sequence identity, preferably at least 75%, preferably at least 80%, preferably at least 85%, 
preferably at least 90%, preferably at least 95%, preferably at least 97%, preferably at 
least 98% and preferably at least 99% sequence identity as compared to a reference 
sequence using programs or algorithms (e.g., BLAST, ALIGN, CLUSTAL) with 
standard parameters. One indication that two polypeptides are substantially identical is that 
the first polypeptide is immunologically cross-reactive with the second polypeptide. 
Typically, polypeptides that differ by conservative amino acid substitutions are 
immunologically cross-reactive. Thus, a polypeptide is substantially identical to a second 
polypeptide, for example, where the two peptides differ only by a conservative substitution. 
Another indication that two nucleic acid sequences are substantially identical is that the two 
molecules hybridize to each other under stringent conditions (e.g., within a range of medium 
to high stringency). 

The phrase "equivalent," in this context, refers to serine protease enzymes that are 
encoded by a polynucleotide capable of hybridizing to the polynucleotide having the 
sequence as shown in SEQ ID NO:1, under conditions of medium to maximal stringency. 
For example, being equivalent means that an equivalent mature serine protease comprises 
at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 
92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% 
and/or at least 99% sequence identity to the mature Cellulomonas serine protease having 
the amino acid sequence of SEQ ID NO:8. 

The term "isolated" or "purified" refers to a material that is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, the 
material is said to be "purified" when it is present in a particular composition in a higher or 
lower concentration than exists in a naturally occurring or wild type organism or in 
combination with components not normally present upon expression from a naturally 
occurring or wild type organism. For example, a naturally-occurring polynucleotide or 
polypeptide present in a living animal is not isolated, but the same polynucleotide or 
polypeptide, separated from some or all of the coexisting materials in the natural system, is 
isolated. Such polynucleotides could be part of a vector, and/or such polynucleotides or 
polypeptides could be part of a composition, and still be isolated in that such vector or 
composition is not part of its natural environment. In preferred embodiments, a nucleic acid 
or protein is said to be purified, for example, if it gives rise to essentially one band in an 
electrophoretic gel or blot. 

The term "isolated", when used in reference to a DNA sequence, refers to a DNA 
sequence that has been removed from its natural genetic milieu and is thus free of other 
extraneous or unwanted coding sequences, and is in a form suitable for use within 
genetically engineered protein production systems. Such isolated molecules are those that 
are separated from their natural environment and include cDNA and genomic clones. 
Isolated DNA molecules of the present invention are free of other genes with which they are 
ordinarily associated, but may include naturally occurring 5' and 3' untranslated regions such 
as promoters and terminators. The identification of associated regions will be evident to one 
of ordinary skill in the art (See e.g., Dynan and Tjian, Nature 316:774-78 [1985]). The term 
"an isolated DNA sequence" is alternatively referred to as "a cloned DNA sequence". 

The term "isolated," when used in reference to a protein, refers to a protein that is 
found in a condition other than its native environment. In a preferred form, the isolated 
protein is substantially free of other proteins, particularly other homologous proteins. An 
isolated protein is more than 10% pure, preferably more than 20% pure, and even more 
preferably more than 30% pure, as determined by SDS-PAGE. Further aspects of the 
invention encompass the protein in a highly purified form (i.e., more than 40% pure, more 
than 60% pure, more than 80% pure, more than 90% pure, more than 95% pure, more than 
97% pure, and even more than 99% pure), as determined by SDS-PAGE. 

As used herein, the term, "combinatorial mutagenesis" refers to methods in which 
libraries of variants of a starting sequence are generated. In these libraries, the variants 
contain one or several mutations chosen from a predefined set of mutations. In addition, the 
methods provide means to introduce random mutations which were not members of the 
predefined set of mutations. In some embodiments, the methods include those set forth in 
U.S. Patent Appln. Ser. No. 09/699,250, filed October 26, 2000, hereby incorporated by 
reference. In alternative embodiments, combinatorial mutagenesis methods encompass 
commercially available kits (e.g., QuikChange® Multisite, Stratagene, San Diego, CA). 

As used herein, the term "library of mutants" refers to a population of cells which are 
identical in most of their genome but include different homologues of one or more genes. 
Such libraries can be used, for example, to identify genes or operons with improved traits. 

As used herein, the term "starting gene" refers to a gene of interest that encodes a 
protein of interest that is to be improved and/or changed using the present invention. 

As used herein, the term "multiple sequence alignment" ("MSA") refers to the 
sequences of multiple homologs of a starting gene that are aligned using an algorithm (e.g., 
Clustal W). 

As used herein, the terms "consensus sequence" and "canonical sequence" refer to 
an archetypical amino acid sequence against which all variants of a particular protein or 
sequence of interest are compared. The terms also refer to a sequence that sets forth the 
nucleotides that are most often present in a DNA sequence of interest. For each position of 
a gene, the consensus sequence gives the amino acid that is most abundant in that position 
in the MSA. 
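
As a minimal sketch of the definition above, a consensus sequence can be derived from an MSA by taking the most abundant residue in each alignment column. The function name `consensus_sequence`, the gap symbol "-", and the tie-breaking rule (the residue encountered first in the column wins) are assumptions made for this illustration and are not prescribed herein.

```python
from collections import Counter


def consensus_sequence(msa: list[str], gap: str = "-") -> str:
    """Return the most abundant residue at each column of an MSA.

    Assumptions: all rows have equal length; gaps are ignored when
    counting; ties go to the residue seen first in the column.
    """
    if not msa:
        raise ValueError("MSA must contain at least one sequence")
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have equal length")
    consensus = []
    for column in zip(*msa):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            consensus.append(gap)  # column contains only gaps
            continue
        consensus.append(counts.most_common(1)[0][0])
    return "".join(consensus)
```

The same function applies to nucleotide alignments, since it treats each column as a set of symbols without regard to alphabet.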

As used herein, the term "consensus mutation" refers to a difference in the sequence 
of a starting gene and a consensus sequence. Consensus mutations are identified by 
comparing the sequences of the starting gene and the consensus sequence resulting from 
an MSA. In some embodiments, consensus mutations are introduced into the starting gene 
such that it becomes more similar to the consensus sequence. Consensus mutations also 
include amino acid changes that change an amino acid in a starting gene to an amino acid 
that is more frequently found in an MSA at that position relative to the frequency of that 
amino acid in the starting gene. Thus, the term consensus mutation comprises all single 
amino acid changes that replace an amino acid of the starting gene with an amino acid that 
is more abundant in the MSA than the amino acid being replaced. 
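
The broader sense of "consensus mutation" given above can be sketched as follows: at each position, every amino acid that occurs more often in the MSA column than the amino acid of the starting gene is reported as a candidate change. The function name `consensus_mutations`, the 1-based position numbering, and the requirement that the starting sequence be pre-aligned to the MSA columns are assumptions made for this example.

```python
from collections import Counter


def consensus_mutations(starting: str, msa: list[str], gap: str = "-"):
    """List (position, starting_residue, replacement_residue) tuples.

    A replacement is reported when it is more abundant in the MSA column
    than the starting residue. Assumption: `starting` is already aligned
    to the MSA, so index i of `starting` matches column i of the MSA.
    """
    if any(len(row) != len(starting) for row in msa):
        raise ValueError("starting sequence and MSA rows must be aligned")
    mutations = []
    columns = zip(*msa)
    for pos, (start_res, column) in enumerate(zip(starting, columns), start=1):
        counts = Counter(res for res in column if res != gap)
        start_count = counts.get(start_res, 0)
        for res, n in sorted(counts.items()):
            if res != start_res and n > start_count:
                mutations.append((pos, start_res, res))
    return mutations
```

In this sketch a position can yield more than one candidate change, which reflects the statement that the term comprises all single amino acid changes to a more abundant amino acid, not only the change to the single most abundant one.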

As used herein, the term "initial hit" refers to a variant that was identified by 
screening a combinatorial consensus mutagenesis library. In preferred embodiments, initial 
hits have improved performance characteristics, as compared to the starting gene. 

As used herein, the term "improved hit" refers to a variant that was identified by 
screening an enhanced combinatorial consensus mutagenesis library. 

As used herein, the terms "improving mutation" and "performance-enhancing 
mutation" refer to a mutation that leads to improved performance when it is introduced into 
the starting gene. In some preferred embodiments, these mutations are identified by 
sequencing hits that were identified during the screening step of the method. In most 
embodiments, mutations that are found more frequently in hits than in an unscreened 
combinatorial consensus mutagenesis library are likely to be improving mutations. 

As used herein, the term "enhanced combinatorial consensus mutagenesis library" 
refers to a combinatorial consensus mutagenesis (CCM) library that is designed and constructed based on screening and/or 
sequencing results from an earlier round of CCM mutagenesis and screening. In some 
embodiments, the enhanced CCM library is based on the sequence of an initial hit resulting 
from an earlier round of CCM. In additional embodiments, the enhanced CCM is designed 
such that mutations that were frequently observed in initial hits from earlier rounds of 
mutagenesis and screening are favored. In some preferred embodiments, this is 
accomplished by omitting primers that encode performance-reducing mutations or by 
increasing the concentration of primers that encode performance-enhancing mutations 
relative to other primers that were used in earlier CCM libraries. 

As used herein, the term "performance-reducing mutations" refers to mutations in the 
combinatorial consensus mutagenesis library that are less frequently found in hits resulting 
from screening as compared to an unscreened combinatorial consensus mutagenesis 
library. In preferred embodiments, the screening process removes and/or reduces the 
abundance of variants that contain "performance-reducing mutations." 
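
The frequency comparison that distinguishes improving mutations from performance-reducing mutations can be sketched as below. The function name `classify_mutations`, the mutation labels such as "A12V", and the "neutral" category for equal frequencies are hypothetical and introduced only for illustration. The sketch compares raw frequencies and applies no statistical test, which a practical analysis would likely require.

```python
def classify_mutations(
    hit_freq: dict[str, float], library_freq: dict[str, float]
) -> dict[str, str]:
    """Label each mutation by its frequency in hits versus the unscreened library.

    Frequencies are fractions of sequenced clones carrying the mutation.
    A mutation absent from `hit_freq` is treated as having frequency 0.0.
    """
    labels = {}
    for mutation, lib in library_freq.items():
        hit = hit_freq.get(mutation, 0.0)
        if hit > lib:
            labels[mutation] = "improving"  # enriched among hits
        elif hit < lib:
            labels[mutation] = "performance-reducing"  # depleted among hits
        else:
            labels[mutation] = "neutral"  # illustrative category only
    return labels
```

Labels of this kind could inform the design of an enhanced library, for example by suggesting which primers to omit or to increase in concentration.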

As used herein, the term "functional assay" refers to an assay that provides an 
indication of a protein's activity. In particularly preferred embodiments, the term refers to 
assay systems in which a protein is analyzed for its ability to function in its usual capacity. 
For example, in the case of enzymes, a functional assay involves determining the 
effectiveness of the enzyme in catalyzing a reaction. 

As used herein, the term "target property" refers to the property of the starting gene 
that is to be altered. It is not intended that the present invention be limited to any particular 
target property. However, in some preferred embodiments, the target property is the 
stability of a gene product (e.g., resistance to denaturation, proteolysis or other degradative 
factors), while in other embodiments, the level of production in a production host is altered. 
Indeed, it is contemplated that any property of a starting gene will find use in the present 
invention. 

The term "property" or grammatical equivalents thereof in the context of a nucleic 
acid, as used herein, refer to any characteristic or attribute of a nucleic acid that can be 
selected or detected. These properties include, but are not limited to, a property affecting 
binding to a polypeptide, a property conferred on a cell comprising a particular nucleic acid, 
a property affecting gene transcription (e.g., promoter strength, promoter recognition, 
promoter regulation, enhancer function), a property affecting RNA processing (e.g., RNA 
splicing, RNA stability, RNA conformation, and post-transcriptional modification), a property 
affecting translation (e.g., level, regulation, binding of mRNA to ribosomal proteins, 
post-translational modification). For example, a binding site for a transcription factor, 
polymerase, regulatory factor, etc., of a nucleic acid may be altered to produce desired 
characteristics or to identify undesirable characteristics. 

The term "property" or grammatical equivalents thereof in the context of a 
polypeptide, as used herein, refer to any characteristic or attribute of a polypeptide that can 
be selected or detected. These properties include, but are not limited to oxidative stability, 
substrate specificity, catalytic activity, thermal stability, alkaline stability, pH activity profile, 
resistance to proteolytic degradation, KM, kcat, kcat/KM ratio, protein folding, inducing an 
immune response, ability to bind to a ligand, ability to bind to a receptor, ability to be 
secreted, ability to be displayed on the surface of a cell, ability to oligomerize, ability to 
signal, ability to stimulate cell proliferation, ability to inhibit cell proliferation, ability to induce 
apoptosis, ability to be modified by phosphorylation or glycosylation, ability to treat disease. 

As used herein, the term "screening" has its usual meaning in the art and is, in 
general, a multi-step process. In the first step, a mutant nucleic acid or variant polypeptide 
therefrom is provided. In the second step, a property of the mutant nucleic acid or variant 
polypeptide is determined. In the third step, the determined property is compared to a 
property of the corresponding precursor nucleic acid, to the property of the corresponding 
naturally occurring polypeptide or to the property of the starting material (e.g., the initial 
sequence) for the generation of the mutant nucleic acid. 

It will be apparent to the skilled artisan that the screening procedure for obtaining a 
nucleic acid or protein with an altered property depends upon the property of the starting 
material the modification of which the generation of the mutant nucleic acid is intended to 
facilitate. The skilled artisan will therefore appreciate that the invention is not limited to any 
specific property to be screened for and that the following description of properties lists 
illustrative examples only. Methods for screening for any particular property are generally 
described in the art. For example, one can measure binding, pH, specificity, etc., before 
and after mutation, wherein a change indicates an alteration. Preferably, the screens are 
performed in a high-throughput manner, including multiple samples being screened 
simultaneously, including, but not limited to assays utilizing chips, phage display, and 
multiple substrates and/or indicators. 

As used herein, in some embodiments, screens encompass selection steps in which 
variants of interest are enriched from a population of variants. Examples of these 
embodiments include the selection of variants that confer a growth advantage to the host 
organism, as well as phage display or any other method of display, where variants can be 
captured from a population of variants based on their binding or catalytic properties. In a 
preferred embodiment, a library of variants is exposed to stress (heat, protease, 
denaturation) and subsequently variants that are still intact are identified in a screen or 
enriched by selection. It is intended that the term encompass any suitable means for 
selection. Indeed, it is not intended that the present invention be limited to any particular 
method of screening. 

As used herein, the term "targeted randomization" refers to a process that produces 
a plurality of sequences where one or several positions have been randomized. In some 
embodiments, randomization is complete (i.e., all four nucleotides, A, T, G, and C can occur 
at a randomized position). In alternative embodiments, randomization of a nucleotide is 
limited to a subset of the four nucleotides. Targeted randomization can be applied to one or 
several codons of a sequence, coding for one or several proteins of interest. When 
expressed, the resulting libraries produce protein populations in which one or more amino 
acid positions can contain a mixture of all 20 amino acids or a subset of amino acids, as 
determined by the randomization scheme of the randomized codon. In some embodiments, 
the individual members of a population resulting from targeted randomization differ in the 
number of amino acids, due to targeted or random insertion or deletion of codons. In further 
embodiments, synthetic amino acids are included in the protein populations produced. In 
some preferred embodiments, the majority of members of a population resulting from 
targeted randomization show greater sequence homology to the consensus sequence than 
the starting gene. In some embodiments, the sequence encodes one or more proteins of 
interest. In alternative embodiments, the proteins have differing biological functions. In 
some preferred embodiments, the incoming sequence comprises at least one selectable 
marker. 
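
To illustrate how the randomization scheme of a codon determines the set of encoded amino acids, the following sketch expands a degenerate codon written in IUPAC nucleotide codes and translates each resulting codon with the standard genetic code. The helper names `expand_codon` and `encoded_amino_acids` are hypothetical, and the example schemes "NNN" (complete randomization) and "NNK" (third position limited to G or T) are common choices offered here as assumptions, not as schemes required by the present methods.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_codon(degenerate: str) -> list[str]:
    """Enumerate every concrete codon matched by a 3-letter degenerate codon."""
    if len(degenerate) != 3:
        raise ValueError("a codon must have exactly three positions")
    choices = (IUPAC[base] for base in degenerate.upper())
    return ["".join(codon) for codon in product(*choices)]


def encoded_amino_acids(degenerate: str) -> list[str]:
    """Sorted set of amino acids (and '*' for stop) encoded by the scheme."""
    return sorted({CODON_TABLE[codon] for codon in expand_codon(degenerate)})
```

Under these assumptions, `expand_codon("NNK")` yields 32 codons that together encode all 20 amino acids plus one stop codon, whereas a narrower scheme such as "GST" encodes only alanine and glycine, illustrating randomization limited to a subset of amino acids.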

The terms "modified sequence" and "modified genes" are used interchangeably 
herein to refer to a sequence that includes a deletion, insertion or interruption of naturally 
occurring nucleic acid sequence. In some preferred embodiments, the expression product 
of the modified sequence is a truncated protein (e.g., if the modification is a deletion or 
interruption of the sequence). In some particularly preferred embodiments, the truncated 
protein retains biological activity. In alternative embodiments, the expression product of the 
modified sequence is an elongated protein (e.g., modifications comprising an insertion into 
the nucleic acid sequence). In some embodiments, an insertion leads to a truncated protein 
(e.g., when the insertion results in the formation of a stop codon). Thus, an insertion may 
result in either a truncated protein or an elongated protein as an expression product. 

As used herein, the terms "mutant sequence" and "mutant gene" are used 
interchangeably and refer to a sequence that has an alteration in at least one codon 
occurring in a host cell's wild-type sequence. The expression product of the mutant 
sequence is a protein with an altered amino acid sequence relative to the wild-type. The 
expression product may have an altered functional capacity (e.g., enhanced enzymatic 
activity). 

The terms "mutagenic primer" or "mutagenic oligonucleotide" (used interchangeably 
herein) are intended to refer to oligonucleotide compositions which correspond to a portion 
of the template sequence and which are capable of hybridizing thereto. With respect to 
mutagenic primers, the primer will not precisely match the template nucleic acid, the 
mismatch or mismatches in the primer being used to introduce the desired mutation into the 
nucleic acid library. As used herein, "non-mutagenic primer" or "non-mutagenic 
oligonucleotide" refers to oligonucleotide compositions which will match precisely to the 
template nucleic acid. In one embodiment of the invention, only mutagenic primers are 
used. In another preferred embodiment of the invention, the primers are designed so that 
for at least one region at which a mutagenic primer has been included, there is also a 
non-mutagenic primer included in the oligonucleotide mixture. By adding a mixture of mutagenic 
primers and non-mutagenic primers corresponding to at least one of the mutagenic primers, 
it is possible to produce a resulting nucleic acid library in which a variety of combinatorial 
mutational patterns are presented. For example, if it is desired that some of the members of 
the mutant nucleic acid library retain their precursor sequence at certain positions while 
other members are mutant at such sites, the non-mutagenic primers provide the ability to 
obtain a specific level of non-mutant members within the nucleic acid library for a given 
residue. The methods of the invention employ mutagenic and non-mutagenic 
oligonucleotides which are generally between 10-50 bases in length, more preferably about 
15-45 bases in length. However, it may be necessary to use primers that are either shorter 
than 10 bases or longer than 50 bases to obtain the mutagenesis result desired. With 
respect to corresponding mutagenic and non-mutagenic primers, it is not necessary that the 
corresponding oligonucleotides be of identical length, but only that there is overlap in the 
region corresponding to the mutation to be added. 

Primers may be added in a pre-defined ratio according to the present invention. For 
example, if it is desired that the resulting library have a significant level of a certain specific 
mutation and a lesser amount of a different mutation at the same or different site, by 
adjusting the amount of primer added, it is possible to produce the desired biased library. 
Alternatively, by adding lesser or greater amounts of non-mutagenic primers, it is possible to 
adjust the frequency with which the corresponding mutation(s) are produced in the mutant 
nucleic acid library. 
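
As a rough illustration of biasing a library through primer ratios, the sketch below estimates the expected fraction of library members carrying each sequence at a single site. It rests on a deliberately simplified assumption that is not stated herein: that each primer is incorporated in proportion to its share of the total primer amount at that site. Actual incorporation may deviate from this idealization, and the function name `expected_site_frequencies` and the primer labels are hypothetical.

```python
def expected_site_frequencies(primer_amounts: dict[str, float]) -> dict[str, float]:
    """Expected fraction of library members per primer at one site.

    Simplifying assumption: incorporation is proportional to the relative
    amount of each primer (mutagenic or non-mutagenic) covering the site.
    """
    if any(amount < 0 for amount in primer_amounts.values()):
        raise ValueError("primer amounts must be non-negative")
    total = sum(primer_amounts.values())
    if total <= 0:
        raise ValueError("at least one primer must have a positive amount")
    return {name: amount / total for name, amount in primer_amounts.items()}
```

For instance, under this assumption, mixing one part of a non-mutagenic primer with three parts of a mutagenic primer for the same site would be expected to leave about a quarter of the library members with the precursor sequence at that site.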

As used herein, the phrase "contiguous mutations" refers to mutations which are 
presented within the same oligonucleotide primer. For example, contiguous mutations may 
be adjacent or nearby each other, however, they will be introduced into the resulting mutant 
template nucleic acids by the same primer. 

As used herein, the phrase "discontiguous mutations" refers to mutations which are 
presented in separate oligonucleotide primers. For example, discontiguous mutations will 
be introduced into the resulting mutant template nucleic acids by separately prepared 
oligonucleotide primers. 

The terms "wild-type sequence" and "wild-type gene" are used interchangeably 
herein, to refer to a sequence that is native or naturally occurring in a host cell. In some 
embodiments, the wild-type sequence refers to a sequence of interest that is the starting 
point of a protein engineering project. The wild-type sequence may encode either a 
homologous or heterologous protein. A homologous protein is one the host cell would 
produce without intervention. A heterologous protein is one that the host cell would not 
produce but for the intervention. 

As used herein, the term "antibodies" refers to immunoglobulins. Antibodies include 
but are not limited to immunoglobulins obtained directly from any species from which it is 
desirable to produce antibodies. In addition, the present invention encompasses modified 
antibodies. The term also refers to antibody fragments that retain the ability to bind to the 
epitope that the intact antibody binds and include polyclonal antibodies, monoclonal 
antibodies, chimeric antibodies, anti-idiotype (anti-ID) antibodies. Antibody fragments 
include, but are not limited to the complementarity-determining regions (CDRs), single-chain 
fragment variable regions (scFv), heavy chain variable region (VH), light chain variable 
region (VL). Polyclonal and monoclonal antibodies are also encompassed by the present 
invention. Preferably, the antibodies are monoclonal antibodies. 

The term "oxidation stable" refers to proteases of the present invention that retain a 
specified amount of enzymatic activity over a given period of time under conditions 
prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for 
example while exposed to or contacted with bleaching agents or oxidizing agents. In some 
embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 
95%, 96%, 97%, 98% or 99% proteolytic activity after contact with a bleaching or oxidizing 
agent over a given time period, for example, at least 1 minute, 3 minutes, 5 minutes, 8 
minutes, 12 minutes, 16 minutes, 20 minutes, etc. In some embodiments, the stability is 
measured as described in the Examples. 

The term "chelator stable" refers to proteases of the present invention that retain a 
specified amount of enzymatic activity over a given period of time under conditions 
prevailing during the proteolytic, hydrolyzing, cleaning or other process of the invention, for 
example while exposed to or contacted with chelating agents. In some embodiments, the 
proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 
98% or 99% proteolytic activity after contact with a chelating agent over a given time period, 
for example, at least 10 minutes, 20 minutes, 40 minutes, 60 minutes, 100 minutes, etc. In 
some embodiments, the chelator stability is measured as described in the Examples. 

The terms "thermally stable" and "thermostable" refer to proteases of the present 
invention that retain a specified amount of enzymatic activity after exposure to identified 
temperatures over a given period of time under conditions prevailing during the proteolytic, 
hydrolyzing, cleaning or other process of the invention, for example while exposed to altered 
temperatures. Altered temperatures include increased or decreased temperatures. In 
some embodiments, the proteases retain at least 50%, 60%, 70%, 75%, 80%, 85%, 90%, 
92%, 95%, 96%, 97%, 98% or 99% proteolytic activity after exposure to altered 
temperatures over a given time period, for example, at least 60 minutes, 120 minutes, 180 
minutes, 240 minutes, 300 minutes, etc. In some embodiments, the thermostability is 
determined as described in the Examples. 
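
The stability definitions above share a common calculation: the proteolytic activity remaining after a challenge (oxidizing agent, chelating agent, or altered temperature) is expressed as a percentage of the activity before the challenge and compared with a specified minimum. The following sketch shows that arithmetic only. The function names, the activity units, and the default 50% threshold are assumptions chosen for illustration, and the assay used to measure activity is outside the scope of the sketch.

```python
def retained_activity_percent(initial_activity: float, final_activity: float) -> float:
    """Activity remaining after a challenge, as a percent of initial activity.

    Both activities must be measured in the same (arbitrary) units.
    """
    if initial_activity <= 0:
        raise ValueError("initial activity must be positive")
    return 100.0 * final_activity / initial_activity


def is_stable(
    initial_activity: float, final_activity: float, min_percent: float = 50.0
) -> bool:
    """True if retained activity meets the specified minimum percentage."""
    return retained_activity_percent(initial_activity, final_activity) >= min_percent
```

For example, a protease whose activity falls from 200 units to 150 units over the challenge period retains 75% of its activity, which would satisfy a 70% criterion but not an 80% criterion.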

The term "enhanced stability" in the context of an oxidation, chelator, thermal and/or 
pH stable protease refers to a higher retained proteolytic activity over time as compared to 
other serine proteases (e.g., subtilisin proteases) and/or wild-type enzymes. 

The term "diminished stability" in the context of an oxidation, chelator, thermal and/or 
pH stable protease refers to a lower retained proteolytic activity over time as compared to 
other serine proteases (e.g., subtilisin proteases) and/or wild-type enzymes. 

As used herein, the term "cleaning composition" includes, unless otherwise 
indicated, granular or powder-form all-purpose or "heavy-duty" washing agents, especially 
cleaning detergents; liquid, gel or paste-form all-purpose washing agents, especially the so- 
called heavy-duty liquid types; liquid fine-fabric detergents; hand dishwashing agents or light 
duty dishwashing agents, especially those of the high-foaming type; machine dishwashing 
agents, including the various tablet, granular, liquid and rinse-aid types for household and 
institutional use; liquid cleaning and disinfecting agents, including antibacterial hand-wash 
types, cleaning bars, mouthwashes, denture cleaners, car or carpet shampoos, bathroom 
cleaners; hair shampoos and hair-rinses; shower gels and foam baths and metal cleaners; 
as well as cleaning auxiliaries such as bleach additives and "stain-stick" or pre-treat types. 

It is to be understood that the test methods described in the Examples herein are 
used to determine the respective values of the parameters of the present invention, as such 
invention is described and claimed herein. 

Unless otherwise noted, all component or composition levels are in reference to the 
active level of that component or composition, and are exclusive of impurities, for example, 
residual solvents or by-products, which may be present in commercially available sources. 

Enzyme components weights are based on total active protein. 

All percentages and ratios are calculated by weight unless otherwise indicated. All 
percentages and ratios are calculated based on the total composition unless otherwise 
indicated. 

It should be understood that every maximum numerical limitation given throughout 
this specification includes every lower numerical limitation, as if such lower numerical 
limitations were expressly written herein. Every minimum numerical limitation given 
throughout this specification will include every higher numerical limitation, as if such higher 
numerical limitations were expressly written herein. Every numerical range given throughout 
this specification will include every narrower numerical range that falls within such broader 
numerical range, as if such narrower numerical ranges were all expressly written herein. 

The term "cleaning activity" refers to the cleaning performance achieved by the 
protease under conditions prevailing during the proteolytic, hydrolyzing, cleaning or other 
process of the invention. In some embodiments, cleaning performance is determined by the 
application of various cleaning assays concerning enzyme sensitive stains, for example 
grass, blood, milk, or egg protein as determined by various chromatographic, 
spectrophotometric or other quantitative methodologies after subjection of the stains to 
standard wash conditions. Exemplary assays include, but are not limited to those described 
in WO 99/34011, and U.S. Pat. 6,605,458 (both of which are herein incorporated by 
reference), as well as those methods included in the Examples. 

The term "cleaning effective amount" of a protease refers to the quantity of protease 
described hereinbefore that achieves a desired level of enzymatic activity in a specific 
cleaning composition. Such effective amounts are readily ascertained by one of ordinary 
skill in the art and are based on many factors, such as the particular protease used, the 
cleaning application, the specific composition of the cleaning composition, and whether a 
liquid or dry (e.g., granular, bar) composition is required, etc. 

The term "cleaning adjunct materials," as used herein, means any liquid, solid or 
gaseous material selected for the particular type of cleaning composition desired and the 
form of the product (e.g., liquid, granule, powder, bar, paste, spray, tablet, gel, or foam 
composition), which materials are also preferably compatible with the protease enzyme used 
in the composition. In some embodiments, granular compositions are in "compact" form, 
while in other embodiments, the liquid compositions are in a "concentrated" form. 

The term "enhanced performance" in the context of cleaning activity refers to an 
increased or greater cleaning activity of certain enzyme sensitive stains such as egg, milk, 
grass or blood, as determined by usual evaluation after a standard wash cycle and/or 
multiple wash cycles. 

The term "diminished performance" in the context of cleaning activity refers to a 
decreased or lesser cleaning activity of certain enzyme sensitive stains such as egg, milk, 
grass or blood, as determined by usual evaluation after a standard wash cycle. 

The term "comparative performance" in the context of cleaning activity refers to at 
least 60%, at least 70%, at least 80%, at least 90%, or at least 95% of the cleaning activity of a 
comparative subtilisin protease (e.g., commercially available proteases), including but not 
limited to OPTIMASE™ protease (Genencor), PURAFECT™ protease products 
(Genencor), SAVINASE™ protease (Novozymes), BPN'-variants (See e.g., U.S. Pat. No. 
Re 34,606), RELASE™, DURAZYME™, EVERLASE™, KANNASE™ protease 
(Novozymes), MAXACAL™, MAXAPEM™, PROPERASE™ proteases (Genencor; See 
also, U.S. Pat. No. Re 34,606, U.S. Pat. Nos. 5,700,676; 5,955,340; 6,312,936; 6,482,628), 
and B. lentus variant protease products [for example those described in WO 92/21760, WO 
95/23221 and/or WO 97/07770 (Henkel)]. Exemplary subtilisin protease variants include, but 
are not limited to those having substitutions or deletions at residue positions equivalent to 
positions 76, 101, 103, 104, 120, 159, 167, 170, 194, 195, 217, 232, 235, 236, 245, 248, 
and/or 252 of BPN'. Cleaning performance can be determined by comparing the proteases 
of the present invention with those subtilisin proteases in various cleaning assays 
concerning enzyme sensitive stains such as grass, blood or milk as determined by usual 
spectrophotometric or analytical methodologies after standard wash cycle conditions. 

As used herein, a "low detergent concentration" system includes detergents where 
less than about 800 ppm of detergent components are present in the wash water. 
Japanese detergents are typically considered low detergent concentration systems, as they 
usually have approximately 667 ppm of detergent components present in the wash 
water. 

As used herein, a "medium detergent concentration" system includes detergents 
wherein between about 800 ppm and about 2000 ppm of detergent components are present 
in the wash water. North American detergents are generally considered to be medium 
detergent concentration systems as they usually have approximately 975 ppm of detergent 
components present in the wash water. Brazilian detergents typically have approximately 
1500 ppm of detergent components present in the wash water. 

As used herein, a "high detergent concentration" system includes detergents wherein 
greater than about 2000 ppm of detergent components are present in the wash water. 
European detergents are generally considered to be high detergent concentration systems 
as they have approximately 3000-8000 ppm of detergent components in the wash water. 
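
By way of illustration only, the three detergent concentration systems defined above can be sketched as a simple classification of the wash-water concentration. The function name and the treatment of the "about 800 ppm" and "about 2000 ppm" boundaries as exact cut-offs are assumptions made for this sketch and are not part of the definitions.

```python
def classify_detergent_system(ppm: float) -> str:
    """Classify a wash liquor by its detergent component concentration (ppm).

    Assumption: the 'about 800 ppm' and 'about 2000 ppm' limits are treated
    as exact cut-offs purely for illustration.
    """
    if ppm < 0:
        raise ValueError("concentration must be non-negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Typical values quoted in the text above.
    for region, ppm in [("Japan", 667), ("North America", 975),
                        ("Brazil", 1500), ("Europe", 3000)]:
        print(region, ppm, classify_detergent_system(ppm))
```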

As used herein, "fabric cleaning compositions" include hand and machine laundry 
detergent compositions including laundry additive compositions and compositions suitable 
for use in the soaking and/or pretreatment of stained fabrics (e.g., clothes, linens, and other 
textile materials). 

As used herein, "non-fabric cleaning compositions" include non-textile (i.e., non-fabric) 
surface cleaning compositions, including but not limited to dishwashing detergent 
compositions, oral cleaning compositions, denture cleaning compositions, and personal 
cleansing compositions. 

The "compact" form of the cleaning compositions herein is best reflected by density 
and, in terms of composition, by the amount of inorganic filler salt. Inorganic filler salts are 
conventional ingredients of detergent compositions in powder form. In conventional 
detergent compositions, the filler salts are present in substantial amounts, typically 17-35% 
by weight of the total composition. In contrast, in compact compositions, the filler salt is 
present in amounts not exceeding 15% of the total composition. In some embodiments, the 
filler salt is present in amounts that do not exceed 10%, or more preferably, 5%, by weight 
of the composition. In some embodiments, the inorganic filler salts are selected from the 
alkali and alkaline-earth-metal salts of sulfates and chlorides. A preferred filler salt is 
sodium sulfate. 

II. Serine Protease Enzymes and Nucleic Acid Encoding Serine Protease 
Enzymes 

The present invention provides isolated polynucleotides encoding amino acid 
sequences of proteases. In some embodiments, these polynucleotides encode amino acid 
sequences having at least 65% amino acid sequence identity, preferably at least 70% amino acid sequence 
identity, more preferably at least 75% amino acid sequence identity, still more preferably at 
least 80% amino acid sequence identity, more preferably at least 85% amino acid sequence 
identity, even more preferably at least 90% amino acid sequence identity, more preferably at 
least 92% amino acid sequence identity, yet more preferably at least 95% amino acid 
sequence identity, more preferably at least 97% amino acid sequence identity, still more 
preferably at least 98% amino acid sequence identity, and most preferably at least 99% 
amino acid sequence identity to an amino acid sequence as shown in SEQ ID NOS:6-8, 
(e.g., at least a portion of the amino acid sequence encoded by the polynucleotide having 
proteolytic activity, including the mature protease catalyzing the hydrolysis of peptide 
linkages of substrates), and/or demonstrating comparable or enhanced washing 
performance under identified wash conditions. 

In some embodiments, the percent identity (amino acid sequence, nucleic acid 
sequence, gene sequence) is determined by a direct comparison of the sequence 
information between two molecules by aligning the sequences, counting the exact number 
of matches between the two aligned sequences, dividing by the length of the shorter 
sequence, and multiplying the result by 100. Readily available computer programs find use 
in these analyses, such as those described above. Programs for determining nucleotide 
sequence identity are available in the Wisconsin Sequence Analysis Package, Version 8 
(Genetics Computer Group, Madison, WI), for example, the BESTFIT, FASTA and GAP 
programs, which also rely on the Smith and Waterman algorithm. These programs are 
readily utilized with the default parameters recommended by the manufacturer and 
described in the Wisconsin Sequence Analysis Package referred to above. 
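
The direct-comparison calculation described above (count exact matches between two aligned sequences, divide by the length of the shorter sequence, multiply by 100) can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned and padded with a gap character; the function name and the choice of "-" as the gap symbol are assumptions of this sketch, and it does not reproduce the BESTFIT, FASTA or GAP programs.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Matches are counted column by column; the denominator is the ungapped
    length of the shorter of the two sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same (non-gap) residue.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)

    # Ungapped lengths of the original sequences.
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * matches / shorter


if __name__ == "__main__":
    print(percent_identity("ACDEFG", "ACD-FG"))
    print(percent_identity("ACDE", "ACDF"))
```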

An example of an algorithm that is suitable for determining sequence similarity is the 
BLAST algorithm, which is described in Altschul et al., J. Mol. Biol., 215:403-410 (1990). 
Software for performing BLAST analyses is publicly available through the National Center 
for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the 
query sequence that either match or satisfy some positive-valued threshold score T when 
aligned with a word of the same length in a database sequence. These initial neighborhood 
word hits act as starting points to find longer HSPs containing them. The word hits are 
expanded in both directions along each of the two sequences being compared for as far as 
the cumulative alignment score can be increased. Extension of the word hits is stopped 
when: the cumulative alignment score falls off by the quantity X from a maximum achieved 
value; the cumulative score goes to zero or below; or the end of either sequence is reached. 
The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the 
alignment. The BLAST program uses as defaults a wordlength (W) of 11, the BLOSUM62 
scoring matrix (See, Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)), 
alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. 
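
The word-hit extension step described above can be illustrated with a simplified, ungapped, one-directional sketch. It applies the three stopping rules named in the text (the score falls off by X from its maximum, the score reaches zero or below, or the end of either sequence is reached) with the nucleotide match and mismatch scores M=5 and N=-4. The function name, the default drop-off value, and the restriction to rightward extension are assumptions of this sketch; it is not the BLAST implementation.

```python
def extend_word_hit_right(query: str, subject: str, q_start: int, s_start: int,
                          word_len: int, match: int = 5, mismatch: int = -4,
                          x_drop: int = 20) -> tuple[int, int]:
    """Extend an initial word hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the length of the
    highest-scoring segment starting at the word hit.
    """
    def pair_score(i: int) -> int:
        return match if query[q_start + i] == subject[s_start + i] else mismatch

    # Score of the seed word itself.
    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_len = score, word_len

    i = word_len
    # Stop rule 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += pair_score(i)
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        # Stop rule 2: cumulative score goes to zero or below.
        # Stop rule 1: score has dropped by X from the maximum achieved.
        if score <= 0 or best_score - score >= x_drop:
            break
    return best_score, best_len


if __name__ == "__main__":
    print(extend_word_hit_right("ACGTACGTAC", "ACGTACGTTT", 0, 0, 4))
```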

The BLAST algorithm then performs a statistical analysis of the similarity between 
two sequences (See e.g., Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 
[1993]). One measure of similarity provided by the BLAST algorithm is the smallest sum 
probability (P(N)), which provides an indication of the probability by which a match between 
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic 
acid is considered similar to a serine protease nucleic acid of this invention if the smallest 
sum probability in a comparison of the test nucleic acid to a serine protease nucleic acid is 
less than about 0.1, more preferably less than about 0.01, and most preferably less than 
about 0.001. Where the test nucleic acid encodes a serine protease polypeptide, it is 
considered similar to a specified serine protease nucleic acid if the comparison results in a 
smallest sum probability of less than about 0.5, and more preferably less than about 0.2. 

In some embodiments of the present invention, sequences were analyzed by BLAST 
and protein translation sequence tools. In some experiments, the preferred version was 
BLAST (Basic BLAST version 2.0). The program chosen was "BlastX", and the database 
chosen was "nr". Standard/default parameter values were employed. 

In some preferred embodiments, the present invention encompasses the 
polynucleotide of approximately 1621 base pairs in length set forth in SEQ ID NO:1. A start 
codon is shown in bold in SEQ ID NO:1. In another embodiment of the present invention, 
the polynucleotides encoding these amino acid sequences comprise a 1485 base pair 
portion (residues 1-1485 of SEQ ID NO:2) that, if expressed, is believed to encode a signal 
sequence (nucleotides 1-84 of SEQ ID NO:5) encoding amino acids 1-28 of SEQ ID NO:9; 
an N-terminal prosequence (nucleotides 84-594 encoding amino acid residues 29-198 of 
SEQ ID NO:6); a mature protease sequence (nucleotides 595-1161 of SEQ ID NO:2 
encoding amino acid residues 1-189 of SEQ ID NO:8); and a C-terminal pro-sequence 
(nucleotides 1162-1486 encoding amino acid residues 388-495 of SEQ ID NO:6). 
Alternatively, the signal peptide, the N-terminal pro-sequence, mature serine protease 
sequence and C-terminal pro-sequence are numbered in relation to the amino acid residues 
of the mature protease of SEQ ID NO:6 being numbered 1-189, i.e., signal peptide (residues 
-198 to -171), an N-terminal pro-sequence (residues -171 to -1), the mature serine 
protease sequence (residues 1-189) and a C-terminal pro-sequence (residues 190-298). In 
another embodiment of the present invention, the polynucleotide encoding an amino acid 
sequence having proteolytic activity comprises a nucleotide sequence of nucleotides 1 to 
1485 of the portion of SEQ ID NO:2 encoding the signal peptide and precursor protease. In 
another embodiment of the present invention, the polynucleotide encoding an amino acid 
sequence comprises the sequence of nucleotides 1 to 1412 of the polynucleotide encoding 
the precursor Cellulomonas protease (SEQ ID NO:3). In yet another embodiment, the 
polynucleotide encoding an amino acid sequence comprises the sequence of nucleotides 1 
to 587 of the portion of the polynucleotide encoding the mature Cellulomonas protease 
(SEQ ID NO:4). 

As will be understood by the skilled artisan, due to the degeneracy of the genetic 
code, a variety of polynucleotides can encode the signal peptide, precursor protease and/or 
mature protease provided in SEQ ID NOS:6, 7, and/or 8, respectively, or a protease having 
the % sequence identity described above. Another embodiment of the present invention 
encompasses a polynucleotide comprising a nucleotide sequence having at least 70% 
sequence identity, at least 75% sequence identity, at least 80% sequence identity, at least 
85% sequence identity, at least 90% sequence identity, at least 92% sequence identity, at 
least 95% sequence identity, at least 97% sequence identity, at least 98% sequence identity 
and at least 99% sequence identity to the polynucleotide sequence of SEQ ID NOS:2, 3, 
and/or 4, respectively, encoding the signal peptide and precursor protease, the precursor 
protease and/or the mature protease, respectively. 

In additional embodiments, the present invention provides fragments or portions of 
DNA that encodes proteases, so long as the encoded fragment retains proteolytic activity. 
Another embodiment of the present invention encompasses polynucleotides having at least 
20% of the sequence length, at least 30% of the sequence length, at least 40% of the 
sequence length, at least 50% of the sequence length, at least 60% of the sequence length, 
at least 70% of the sequence length, at least 75% of the sequence length, at least 80% of the 
sequence length, at least 85% of the sequence length, at least 90% of the sequence length, 
at least 92% of the sequence length, at least 95% of the sequence length, at least 97% of 
the sequence length, at least 98% of the sequence length and at least 99% of the sequence 
length of the polynucleotide sequence of SEQ ID NO:2, or residues 185-1672 of SEQ ID NO:1, 
encoding the precursor protease. In alternative embodiments, these fragments or portions 
of the sequence length are contiguous portions of the sequence length, useful for shuffling 
of the DNA sequence in recombinant DNA sequences (See e.g., U.S. Pat. No. 6,132,970). 

Another embodiment of the invention includes fragments of the DNA described 
herein that find use according to art recognized techniques in obtaining partial length DNA 
fragments capable of being used to isolate or identify polynucleotides encoding mature 
protease enzyme described herein from Cellulomonas 69B4, or a segment thereof having 
proteolytic activity. Moreover, the DNA provided in SEQ ID NO:1 finds use in identifying 
homologous fragments of DNA from other species, and particularly from Cellulomonas spp. 
which encode a protease or portion thereof having proteolytic activity. 

In addition, the present invention encompasses using primer or probe sequences 
constructed from SEQ ID NO:1, or a suitable portion or fragment thereof (e.g., at least about 
5-20 or 10-15 contiguous nucleotides), as a probe or primer for screening nucleic acid of 
either genomic or cDNA origin. In some embodiments, the present invention provides DNA 
probes of the desired length (i.e., generally between 100 and 1000 bases in length), based 
on the sequences in SEQ ID NOS:1, 2, 3, and/or 4. 

In some embodiments, the DNA fragments are electrophoretically isolated, cut from 
the gel, and recovered from the agar matrix of the gel. In preferred embodiments, this 
purified fragment of DNA is then labeled (using, for example, the Megaprime labeling 
system according to the instructions of the manufacturer) to incorporate ³²P in the DNA. 
The labeled probe is denatured by heating to 95°C for a given period of time (e.g., 5 
minutes), and immediately added to the membrane and prehybridization solution. The 
hybridization reaction proceeds for an appropriate time and under appropriate conditions 
(e.g., 18 hours at 37°C), with gentle shaking or rotation. The membrane is rinsed (e.g., 
twice in SSC/0.3% SDS) and then washed in an appropriate wash solution with gentle 
agitation. The stringency desired is a reflection of the conditions under which the 
membrane (filter) is washed. In some embodiments herein, "low-stringency" conditions 
involve washing with a solution of 0.2X SSC/0.1% SDS at 20°C for 15 minutes, while in 
other embodiments, "medium-stringency" conditions involve a further washing step 
comprising washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 30 minutes, while in 
other embodiments, "high-stringency" conditions involve a further washing step comprising 
washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 45 minutes, and in further 
embodiments, "maximum-stringency" conditions involve a further washing step comprising 
washing with a solution of 0.2X SSC/0.1% SDS at 37°C for 60 minutes. Thus, various 
embodiments of the present invention provide polynucleotides capable of hybridizing to a 
probe derived from the nucleotide sequence provided in SEQ ID NOS:1, 2, 3, 4, and/or 5, 
under conditions of medium, high and/or maximum stringency. 

After washing, the membrane is dried and the bound probe detected. If ³²P or 
another radioisotope is used as the labeling agent, the bound probe is detected by 
autoradiography. Other techniques for the visualization of other probes are well-known to 
those of skill in the art. The detection of a bound probe indicates a nucleic acid sequence 
has the desired homology, and therefore identity to SEQ ID NOS:1, 2, 3, 4, and/or 5, and is 
encompassed by the present invention. Accordingly, the present invention provides 
methods for the detection of nucleic acid encoding a protease encompassed by the present 
invention which comprises hybridizing part or all of a nucleic acid sequence of SEQ ID 
NOS:1, 2, 3, 4, and/or 5 with other nucleic acid of either genomic or cDNA origin. 

As indicated above, in other embodiments, hybridization conditions are based on the 
melting temperature (Tm) of the nucleic acid binding complex, to confer a defined 
"stringency" as explained below. "Maximum stringency" typically occurs at about Tm-5°C 
(5°C below the Tm of the probe); "high stringency" at about 5°C to 10°C below Tm; 
"intermediate stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 
20°C to 25°C below Tm. As known to those of skill in the art, medium, high and/or maximum 
stringency hybridization are chosen such that conditions are optimized to identify or detect 
polynucleotide sequence homologues or equivalent polynucleotide sequences. 
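
As an illustrative sketch only, the Tm-based stringency ranges given above can be expressed as a lookup on the difference between the melting temperature of the probe and the hybridization temperature. The function name, the handling of the shared end-points (5°C, 10°C, 20°C) by assigning each to the more stringent class, and the labels returned outside the stated ranges are assumptions of this sketch.

```python
def classify_stringency(tm_c: float, hyb_temp_c: float) -> str:
    """Classify hybridization stringency from Tm and hybridization temperature.

    delta is the number of degrees Celsius below Tm at which hybridization
    is carried out.
    """
    delta = tm_c - hyb_temp_c
    if delta < 0:
        # Not covered by the definitions above; reported separately.
        return "above Tm"
    if delta <= 5:
        return "maximum"        # about Tm - 5 degrees C
    if delta <= 10:
        return "high"           # about 5 to 10 degrees C below Tm
    if delta <= 20:
        return "intermediate"   # about 10 to 20 degrees C below Tm
    if delta <= 25:
        return "low"            # about 20 to 25 degrees C below Tm
    return "below low-stringency range"


if __name__ == "__main__":
    for temp in (65, 62, 55, 47, 40):
        print(temp, classify_stringency(70.0, temp))
```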

In yet additional embodiments, the present invention provides nucleic acid constructs 
(i.e., expression vectors) comprising the polynucleotides encoding the proteases of the 
present invention. In further embodiments, the present invention provides host cells 
transformed with at least one of these vectors. 

In further embodiments, the present invention provides polynucleotide sequences 
further encoding a signal sequence. In some embodiments, the invention encompasses 
polynucleotides having signal activity comprising a nucleotide sequence having at least 65% 
sequence identity, at least 70% sequence identity, preferably at least 75% sequence 
identity, more preferably at least 80% sequence identity, still further preferably at least 85% 
sequence identity, even more preferably at least 90% sequence identity, more preferably at 
least 95% sequence identity, more preferably at least 97% sequence identity, at least 98% 
sequence identity, and most preferably at least 99% sequence identity to SEQ ID NO:5. 
Thus, in these embodiments, the present invention provides a sequence with a putative 
signal sequence, and polynucleotides being capable of hybridizing to a probe derived from 
the nucleotide sequence disclosed in SEQ ID NO:5 under conditions of medium, high and/or 
maximal stringency, wherein the signal sequences have substantially the same signal 
activity as the signal sequence encoded by the polynucleotide of the present invention. 

In some embodiments, the signal activity is indicated by substantially the same level 
of secretion of the protease into the fermentation medium, as the starting material. For 
example, in some embodiments, the present invention provides fermentation medium 
protease levels at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 
95%, or at least 98% of the secreted protease levels in the fermentation medium as 
provided by the signal sequence of SEQ ID NO:3. In some embodiments, the secreted 
protease levels are ascertained by protease activity analyses such as the pNA assay (See 
e.g., Del Mar, [1979], infra). Additional means for determining the levels of secretion of a 
heterologous or homologous protein in a Gram-positive host cell and detecting secreted 
proteins include using either polyclonal or monoclonal antibodies specific for the protein. 
Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) 
and fluorescence-activated cell sorting (FACS), as is well known to those in the art. 

In further embodiments, the present invention provides polynucleotides encoding an 
amino acid sequence of a signal peptide (nucleotides 1-84 of SEQ ID NO:5), as shown in 
SEQ ID NO:9, nucleotide residue positions 1 to 85 of SEQ ID NO:2, and/or SEQ ID NO:5. 
The invention further encompasses nucleic acid sequences which hybridize to the nucleic 
acid sequence shown in SEQ ID NO:5 under low, medium, high stringency and/or maximum 
stringency conditions, but which have substantially the same signal activity as the sequence. 
The present invention encompasses all such polynucleotides. 

In further embodiments, the present invention provides polynucleotides that are 
complementary to the nucleotide sequences described herein. Exemplary complementary 
nucleotide sequences include those that are provided in SEQ ID NOS:1-5. 

Further aspects of the present invention encompass polypeptides having proteolytic 
activity comprising at least 65% amino acid sequence identity, at least 70% sequence identity, at 
least 75% amino acid sequence identity, at least 80% amino acid sequence identity, at least 
85% amino acid sequence identity, at least 90% amino acid sequence identity, at least 92% 
amino acid sequence identity, at least 95% amino acid sequence identity, at least 97% 
amino acid sequence identity, at least 98% amino acid sequence identity and at least 99% 
amino acid sequence identity to the amino acid sequence of SEQ ID NO:6 (i.e., the signal 
and precursor protease), SEQ ID NO:7 (i.e., the precursor protease), and/or of SEQ ID 
NO:8 (i.e., the mature protease). The proteolytic activity of these polypeptides is determined 
using methods known in the art, including such methods as those used to assess 
detergent function. In further embodiments, the polypeptides are isolated. In additional 
embodiments of the present invention, the polypeptides comprise amino acid sequences 
that are identical to an amino acid sequence selected from the group consisting of the amino acid 
sequences of SEQ ID NOS:6, 7, or 8. In some further embodiments, the polypeptides are 
identical to portions of SEQ ID NOS:6, 7 or 8. 

In some embodiments, the present invention provides isolated polypeptides having 
proteolytic activity, comprising the amino acid sequence approximately 495 amino acids in 
length, as provided in SEQ ID NO:6. In further embodiments, the present invention 
encompasses polypeptides having proteolytic activity comprising the amino acid sequence 
approximately 467 amino acids in length provided in SEQ ID NO:7. In some embodiments, 
these amino acid sequences comprise a signal sequence (amino acids 1-28 of SEQ ID 
NO:9); and a precursor protease (amino acids 1-467 of SEQ ID NO:7). In additional 
embodiments, the present invention encompasses polypeptides comprising an N-terminal 
prosequence (amino acids 1-170 of SEQ ID NO:7), a mature protease sequence (amino 
acids 1-189 of SEQ ID NO:8), and a C-terminal prosequence (amino acids 360-467 of SEQ 
ID NO:7). In still further embodiments, the present invention encompasses polypeptides 
comprising a precursor protease sequence (e.g., amino acids 1-467 of SEQ ID NO:7). In 
yet another embodiment, the present invention encompasses polypeptides comprising a 
mature protease sequence (e.g., amino acids 1-189 of SEQ ID NO:8). 

In further embodiments, the present invention provides polypeptides and/or 
proteases comprising amino acid sequences of the above described sequence derived from 
bacterial species including, but not limited to Micrococcineae which are identified through 
amino acid sequence homology studies. In some embodiments, an amino acid residue of a 
precursor Micrococcineae protease is equivalent to a residue of Cellulomonas strain 69B4, if 
it is either homologous (i.e., corresponding in position in either primary or tertiary structure) 
or analogous to a specific residue or portion of that residue in Cellulomonas strain 69B4 
protease (i.e., having the same or similar functional capacity to combine, react, or interact 
chemically). 

In some preferred embodiments, in order to establish homology to primary structure, 
the amino acid sequence of a precursor protease is directly compared to the Cellulomonas 
strain 69B4 mature protease amino acid sequence and particularly to a set of conserved 
residues which are discerned to be invariant in all or a large majority of Cellulomonas-like 
proteases for which sequence is known. After aligning the conserved residues, allowing for 
necessary insertions and deletions in order to maintain alignment (i.e., avoiding the 
elimination of conserved residues through arbitrary deletion and insertion), the residues 
corresponding to particular amino acids in the mature protease (SEQ ID NO:8) and 
Cellulomonas 69B4 protease are determined. Alignment of conserved residues preferably 
should conserve 100% of such residues. However, alignment of greater than 75% or as 
little as 45% of conserved residues is also adequate to define equivalent residues. 
However, conservation of the catalytic triad, His32/Asp56/Ser137 of SEQ ID NO:8 should be 
maintained. 

For example, in some embodiments, the amino acid sequence of proteases from 
Cellulomonas strain 69B4, and other Micrococcineae spp. described above are aligned to 
provide the maximum amount of homology between amino acid sequences. A comparison 
of these sequences indicates that there are a number of conserved residues contained in 
each sequence. These are the residues that are identified and utilized to establish the 
equivalent residue positions of amino acids identified in the precursor or mature 
Micrococcineae protease in question. 

These conserved residues are used to ascertain the corresponding amino acid 
residues of Cellulomonas strain 69B4 protease in one or more Micrococcineae 
homologues (e.g., Cellulomonas cellasea (DSM 20118) and/or a Cellulomonas homologue 
herein). These particular amino acid sequences are aligned with the sequence of 
Cellulomonas 69B4 protease to produce the maximum homology of conserved residues. By 
this alignment, the sequences and particular residue positions of Cellulomonas 69B4 are 
observed in comparison with other Cellulomonas spp. Thus, the equivalent amino acid for 
the catalytic triad (e.g., in Cellulomonas 69B4 protease) is identifiable in the other 
Micrococcineae spp. In some embodiments of the present invention, the protease 
homologs comprise the equivalent of His32/Asp56/Ser137 of SEQ ID NO:8. 

Another indication that two polypeptides are substantially identical is that the first 
polypeptide is immunologically cross-reactive with the second polypeptide. Methodologies 
for determining immunological cross-reactivity are described in the art and are described in 
the Examples herein. Typically, polypeptides that differ by conservative amino acid 
substitutions are immunologically cross-reactive. Thus, a polypeptide is substantially 
identical to a second polypeptide, for example, where the two peptides differ only by a 
conservative substitution. 

The present invention encompasses proteases obtained from various sources. In 
some preferred embodiments, the proteases are obtained from bacteria, while in other 
embodiments, the proteases are obtained from fungi. 

In some particularly preferred embodiments, the bacterial source is selected from the 
members of the suborder Micrococcineae. In some embodiments, the bacterial source is 
the family Promicromonosporaceae. In some preferred embodiments, the 
Promicromonosporaceae spp. includes and/or is selected from the group consisting of 
Promicromonospora citrea (DSM 43110), Promicromonospora sukumoe (DSM 44121), 
Promicromonospora aerolata (CCM 7043), Promicromonospora vindobonensis (CCM 7044), 
Myceligenerans xiligouense (DSM 15700), Isoptericola variabilis (DSM 10177, basonym 
Cellulosimicrobium variabile), Cellulosimicrobium cellulans (DSM 20424, basonym Nocardia 
cellulans, Cellulomonas cellulans), Cellulosimicrobium funkei, Xylanimonas cellulosilytica 
(LMG 20990), Xylanibacterium ulmi (LMG 21721), and Xylanimicrobium pachnodae (DSM 
12657, basonym Promicromonospora pachnodae). 

In other particularly preferred embodiments, the bacterial source is the family 
Cellulomonadaceae. In some preferred embodiments, the Cellulomonadaceae spp. includes 
and/or is selected from the group of Cellulomonas fimi (ATCC 484, DSM 20113), 
Cellulomonas biazotea (ATCC 486, DSM 20112), Cellulomonas cellasea (ATCC 487, 
21681, DSM 20118), Cellulomonas denverensis, Cellulomonas hominis (DSM 9581), 
Cellulomonas flavigena (ATCC 482, DSM 20109), Cellulomonas persica (ATCC 700642, 
DSM 14784), Cellulomonas iranensis (ATCC 700643, DSM 14785), Cellulomonas 
fermentans (ATCC 43279, DSM 3133), Cellulomonas gelida (ATCC 488, DSM 20111, DSM 
20110), Cellulomonas humilata (ATCC 25174, basonym Actinomyces humiferus), 
Cellulomonas uda (ATCC 491, DSM 20107), Cellulomonas xylanilytica (LMG 21723), 
Cellulomonas septica, Cellulomonas parahominis, Oerskovia turbata (ATCC 25835, DSM 
20577 synonym Cellulomonas turbata), Oerskovia jenensis (DSM 46000), Oerskovia 
enterophila (ATCC 35307, DSM 43852, basonym Promicromonospora enterophila), 
Oerskovia paurometabola (DSM 14281), and Cellulomonas strain 69B4 (DSM 16035). In 
further embodiments, the bacterial source also includes and/or is selected from the group of 
Thermobifida spp., Rarobacter spp., and/or Lysobacter spp. In yet additional embodiments, 
the Thermobifida spp. is Thermobifida fusca (basonym Thermomonospora fusca) (tfpA, 
AAC23545; See, Lao et al., Appl. Environ. Microbiol., 62: 4256-4259 [1996]). In an 
alternative embodiment, the Rarobacter spp. is Rarobacter faecitabidus (RPI, A45053; See 
e.g., Shimoi et al., J. Biol. Chem., 267:25189-25195 [1992]). In yet another embodiment, 
the Lysobacter spp. is Lysobacter enzymogenes. 

In further embodiments, the present invention provides polypeptides and/or 
polynucleotides obtained and/or isolated from fungal sources. In some embodiments, the 
fungal source includes a Metarhizium spp. In some preferred embodiments, the fungal 
source is a Metarhizium anisopliae (CHY1, CAB60729). 

In another embodiment, the present invention provides polypeptides and/or 
polynucleotides derived from a Cellulomonas strain selected from cluster 2 of the taxonomic 
classification described in U.S. Pat. No. 5,401,657, herein incorporated by reference. In US 
Patent 5,401,657, twenty strains of bacteria isolated from in and around alkaline lakes were 
assigned to the type of bacteria known as Gram-positive bacteria on the basis of: (1) the 
Dussault modification of the Gram's staining reaction (Dussault, J. Bacteriol., 70:484-485 
[1955]); (2) the KOH sensitivity test (Gregersen, Eur. J. Appl. Microbiol. Biotechnol., 5:123- 
127 [1978]; Halebian et al., J. Clin. Microbiol., 13:444-448 [1981]); and (3) the 
aminopeptidase reaction (Cerny, Eur. J. Appl. Microbiol., 3:223-225 [1976]; Cerny, Eur. J. 
Appl. Microbiol., 5:113-122 [1978]). In addition, in most cases, confirmation was also made 
on the basis of quinone analysis (Collins and Jones, Microbiol. Rev., 45:316-354 [1981]) 
using the method described by Collins (See, Collins, In Goodfellow and Minnikin (eds), 
Chemical Methods in Bacterial Systematics, Academic Press, London [1985], pp. 267-288). 
In addition, strains can be tested for 200 characters and the results analyzed using the 
principles of numerical taxonomy (See e.g., Sneath and Sokal, Numerical Taxonomy, W.H. 
Freeman & Co., San Francisco, CA [1973]). Exemplary characters tested, testing 
methods, and codification methods are also described in U.S. Pat. 5,401,657. 

As described in U.S. Pat. No. 5,401,657, the phenetic data, consisting of 200 unit 
characters, were scored and set out in the form of an "n × t" matrix, whose t columns 
represent the t bacterial strains to be grouped on the basis of resemblances, and whose 
n rows are the unit characters. Taxonomic resemblance of the bacterial strains was 
estimated by means of a similarity coefficient (Sneath and Sokal, supra, pp. 114-187). 
Although many different coefficients have been used for biological classification, only a few 
have found regular use in bacteriology. Three association coefficients (See e.g., Sneath 
and Sokal, supra, at p. 129), namely, the Gower, Jaccard and Simple Matching coefficients, 
were applied. These have been frequently applied to the analysis of bacteriological data and 
are widely accepted by those skilled in the art, as they have been shown to result in robust 
classifications. 
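
By way of illustration only, the Simple Matching and Jaccard coefficients for two-state characters may be sketched as follows. The function names and the example scores are hypothetical and are not taken from U.S. Pat. No. 5,401,657 or from TAXPAK; the treatment of two all-negative strains is an assumed convention.

```python
def simple_matching(a, b):
    """S_SM: share of characters on which two strains agree (1/1 or 0/0)."""
    if len(a) != len(b) or not a:
        raise ValueError("strains must be scored for the same characters")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def jaccard(a, b):
    """S_J: positive matches over characters positive in at least one strain."""
    if len(a) != len(b):
        raise ValueError("strains must be scored for the same characters")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two all-negative strains score 0.0.
    return both / either if either else 0.0


# Hypothetical two-state scores (1 = positive, 0 = negative) for 8 characters.
strain_a = [1, 1, 0, 0, 1, 0, 0, 0]
strain_b = [1, 0, 0, 0, 1, 1, 0, 0]

print(simple_matching(strain_a, strain_b))  # 6 of the 8 characters agree
print(jaccard(strain_a, strain_b))          # 2 shared positives out of 4
```

The example shows why the two coefficients can diverge: negative matches raise S_SM but are ignored by S_J.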

The coded data were analyzed using the TAXPAK program package (Sackin, Meth. 
Microbiol., 19:459-494 [1987]), run on a DEC VAX computer at the University of Leicester, 
U.K. 

A similarity matrix was constructed for all pairs of strains using the Gower Coefficient 
(S_G) with the option of permitting negative matches (See, Sneath and Sokal, supra, at pp. 
135-136), using the RTBNSIM program in TAXPAK. As the primary instrument of analysis 
and the one upon which most of the taxonomic data presented herein are based, the Gower 
Coefficient was chosen over other coefficients for generating similarity matrices because it 
is applicable to all types of characters or data, namely, two-state, multistate (ordered and 
qualitative), and quantitative. 
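
As a minimal sketch of how a Gower-type coefficient can accommodate mixed character types, the following assumes that two-state and qualitative characters score 1 for a match and 0 otherwise (negative matches permitted), and that ordered and quantitative characters are scaled by their range. The function name, the character labels and the example values are hypothetical; the RTBNSIM implementation may differ in detail.

```python
def gower_similarity(a, b, kinds, ranges):
    """Average per-character similarity over all comparable characters."""
    total, compared = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # missing data: the character is not compared
        if kind in ("ordered", "quantitative"):
            # Range-scaled difference; a zero range means all strains agree.
            score = 1.0 - abs(x - y) / rng if rng else 1.0
        else:
            # Two-state and qualitative characters: match or mismatch.
            score = 1.0 if x == y else 0.0
        total += score
        compared += 1
    if compared == 0:
        raise ValueError("no comparable characters")
    return total / compared


# Hypothetical characters: oxidase (two-state), colony colour (qualitative),
# growth score 0-4 (ordered), maximum growth temperature in deg C (quantitative).
kinds = ["two-state", "qualitative", "ordered", "quantitative"]
ranges = [None, None, 4, 28.0]
strain_a = [1, "yellow", 2, 30.0]
strain_b = [0, "yellow", 3, 37.0]

s_g = gower_similarity(strain_a, strain_b, kinds, ranges)
print(s_g)
```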

Cluster analysis of the similarity matrix was accomplished using the Unweighted Pair 
Group Method with Arithmetic Averages (UPGMA) algorithm, also known as the Unweighted 
Average Linkage procedure, by running the SMATCLST sub-routine in TAXPAK. 
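
The UPGMA procedure can be sketched in a few lines. The following is an illustrative implementation operating on a similarity matrix, not the SMATCLST sub-routine itself; the strain labels and similarity values are hypothetical.

```python
def upgma(names, sim):
    """Return the merges as (members_a, members_b, similarity), in order."""
    clusters = {i: (name,) for i, name in enumerate(names)}
    # Similarity between current clusters, keyed by the pair of cluster ids.
    s = {frozenset((i, j)): sim[i][j]
         for i in clusters for j in clusters if i < j}
    merges, new_id = [], len(names)
    while len(clusters) > 1:
        pair = max(s, key=s.get)          # most similar pair of clusters
        i, j = sorted(pair)
        level = s.pop(pair)
        a, b = clusters.pop(i), clusters.pop(j)
        for k in clusters:
            s_ik = s.pop(frozenset((i, k)))
            s_jk = s.pop(frozenset((j, k)))
            # Size-weighted mean, i.e. the unweighted average over all strain pairs.
            s[frozenset((new_id, k))] = (
                (len(a) * s_ik + len(b) * s_jk) / (len(a) + len(b)))
        clusters[new_id] = a + b
        merges.append((a, b, level))
        new_id += 1
    return merges


# Hypothetical similarity matrix for four strains.
names = ["A", "B", "C", "D"]
sim = [[1.0, 0.9, 0.6, 0.5],
       [0.9, 1.0, 0.7, 0.5],
       [0.6, 0.7, 1.0, 0.8],
       [0.5, 0.5, 0.8, 1.0]]

merges = upgma(names, sim)
for a, b, level in merges:
    print(a, b, round(level, 3))
```

Each merge level is the mean similarity over all strain pairs spanning the two clusters, which is what distinguishes average linkage from single or complete linkage.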

Dendrograms illustrate the levels of similarity between bacterial strains. In some 
embodiments, dendrograms are obtained by using the DENDGR program in TAXPAK. The 
phenetic data were re-analyzed using the Jaccard Coefficient (S_J) (Sneath and Sokal, supra, 
at p. 131) and Simple Matching Coefficient (S_SM) (Sneath and Sokal, supra, at p. 
132) by running the RTBNSIM program in TAXPAK. An additional two dendrograms were 
obtained by using the SMATCLST with UPGMA option and DENDGR sub-routines in 
TAXPAK. 

Using the S_G/UPGMA method, six natural clusters or phenons of alkalophilic 
bacteria were generated at the 79% similarity level. These six clusters included 15 of the 20 
alkalophilic bacteria isolated from alkaline lakes. Although the choice of 79% for the level of 
delineation was arbitrary, it was in keeping with current practices in numerical taxonomy 
(See e.g., Austin and Priest, Modern Bacterial Taxonomy, Van Nostrand Reinhold, Wokingham, 
U.K., [1986], p. 37). Placing the delineation at a lower percentage would combine groups of 
clearly unrelated organisms whose definition is not supported by the data. At the 79% level, 
3 of the clusters exclusively contain novel alkalophilic bacteria representing 13 of the newly 
isolated strains (potentially representing new taxa). Strain 69B4 was classified in 
cluster 2 by this method. 
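
The effect of the delineation level can be illustrated by cutting a dendrogram at a chosen similarity. The sketch below assumes a list of UPGMA merges in the form (members, members, similarity); the merge list, strain labels and function name are hypothetical and do not reproduce the data of U.S. Pat. No. 5,401,657.

```python
def phenons_at(names, merges, cutoff):
    """Group strains joined at or above the cutoff similarity level."""
    group = {name: frozenset([name]) for name in names}
    for left, right, level in merges:
        if level >= cutoff:
            joined = frozenset(left) | frozenset(right)
            for name in joined:
                group[name] = joined
    return sorted(sorted(g) for g in set(group.values()))


# Hypothetical merges, listed from highest to lowest similarity.
names = ["A", "B", "C", "D"]
merges = [(("A",), ("B",), 0.90),
          (("C",), ("D",), 0.80),
          (("A", "B"), ("C", "D"), 0.575)]

print(phenons_at(names, merges, 0.79))  # two phenons at the 79% level
print(phenons_at(names, merges, 0.50))  # a lower level combines everything
```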

The significance of the clustering at this level was supported by the results of the 
TESTDEN program. This program tests the significance of all dichotomous pairs of clusters 
(comprising 4 or more strains) in a UPGMA-generated dendrogram, with squared Euclidean 
distances, or their complement, as a measurement and assuming that the clusters are 
hyperspherical. The critical overlap was set at 0.25%. The separation of the clusters is 
highly significant. 

The S_J coefficient is a useful adjunct to the S_G coefficient, as it can be used to detect 
phenons in the latter that are based on negative matches or distortions owing to undue 
weight being put on potentially subjective qualitative data. Consequently, the S_J coefficient 
is useful for confirming the validity of clusters defined initially by the use of the S_G 
coefficient. The Jaccard Coefficient is particularly useful in comparing biochemically 
unreactive organisms (Austin and Priest, supra, at p. 37). In addition, there may be some 
question about the admissibility of matching negative character states (See, Sneath and 
Sokal, supra, at p. 131), in which case the Simple Matching Coefficient is a widely applied 
alternative. Strain 69B4 was classified in cluster 2 by this method. 

In the main, all of the clusters (especially the clusters of the new bacteria) generated 
by the S_G/UPGMA method were recovered in the dendrograms produced by the S_J/UPGMA 
method (cophenetic correlation, 0.795), and the S_SM/UPGMA method (cophenetic 
correlation, 0.814). The main effect of these transformations was to gather all the Bacillus 
strains in a single large cluster which further serves to emphasize the separation between 
the alkalophilic Bacillus species and the new alkalophilic bacteria, and the uniqueness of the 
latter. Based on these methodologies, 69B4 is considered to be a cluster 2 bacterium. 
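
For readers unfamiliar with the cophenetic correlation, it may be understood as the Pearson correlation between the original pairwise similarities and the similarity levels at which each pair of strains is first joined in the dendrogram. A minimal sketch follows; the similarity matrix and merge list are hypothetical and the function name is illustrative.

```python
from itertools import combinations
from math import sqrt


def cophenetic_correlation(names, sim, merges):
    """Pearson correlation of original and dendrogram-implied similarities."""
    coph = {}
    for left, right, level in merges:
        for a in left:
            for b in right:
                coph[frozenset((a, b))] = level  # level at which a and b join
    index = {name: i for i, name in enumerate(names)}
    x, y = [], []
    for a, b in combinations(names, 2):
        x.append(sim[index[a]][index[b]])
        y.append(coph[frozenset((a, b))])
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical similarity matrix and the UPGMA merges derived from it.
names = ["A", "B", "C", "D"]
sim = [[1.0, 0.9, 0.6, 0.5],
       [0.9, 1.0, 0.7, 0.5],
       [0.6, 0.7, 1.0, 0.8],
       [0.5, 0.5, 0.8, 1.0]]
merges = [(("A",), ("B",), 0.9),
          (("C",), ("D",), 0.8),
          (("A", "B"), ("C", "D"), 0.575)]

r = cophenetic_correlation(names, sim, merges)
print(round(r, 3))
```

A value near 1 indicates that the dendrogram distorts the similarity matrix very little.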

In other aspects of the present invention, the polynucleotide is derived from a 
bacterium having a 16S rRNA gene nucleotide sequence with at least 70%, 75%, 80%, 85%, 88%, 
90%, 92%, 95%, or 98% sequence identity with the 16S rRNA gene nucleotide sequence of 
Cellulomonas strain 69B4. The sequence of the 16S rRNA gene is deposited at GenBank 
under Accession Number X92152. 
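
As a simple illustration of how a percent sequence identity may be computed from two pre-aligned sequences, the sketch below counts matching positions and, as an assumed convention, excludes positions where either sequence has a gap. Alignment programs differ in how they treat gaps, so this is one possible definition rather than the one used herein; the sequences shown are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free aligned positions at which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumed convention: gapped positions are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not the 69B4 16S rRNA gene sequence).
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity)
print(identity >= 70.0)  # test against a chosen identity threshold
```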

Figure 1 provides an unrooted phylogenetic tree illustrating the relationship of novel 
strain 69B4 to members of the family Cellulomonadaceae (including Cellulomonas strain 
69B4) and other related genera of the suborder Micrococcineae. The dendrogram was 
constructed from aligned 16S rDNA sequences (1374 nt) using TREECONW v.1.3b (Van de 
Peer and De Wachter, Comput. Appl. Biosci., 10: 569-570 [1994]). Distance estimations 
were calculated using the substitution rate calibration of Jukes and Cantor (Jukes and 
Cantor, "Evolution of protein molecules," In Munro (ed.), Mammalian Protein Metabolism, 
Academic Press, NY, at pp. 21-132, [1969]) and tree topology inferred by the 
Neighbor-Joining algorithm (Saitou and Nei, Mol. Biol. Evol., 4:406-425 [1987]). The numbers at the 
nodes refer to bootstrap values from 100 resampled data sets (Felsenstein, Evol., 39:783- 
789 [1985]) and the scale bar indicates 2 nucleotide substitutions in 100 nt. 
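
The Jukes and Cantor correction converts the observed proportion p of differing sites into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3). A minimal sketch is given below; it is not the TREECONW implementation, the function name is illustrative, and the sequences are hypothetical.

```python
import math


def jukes_cantor_distance(aligned_a, aligned_b):
    """Estimated substitutions per site under the Jukes-Cantor model."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Only positions where both sequences carry an unambiguous base are used.
    sites = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    p = sum(1 for x, y in sites if x != y) / len(sites)
    if p >= 0.75:
        raise ValueError("sequences too divergent for this correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical 100 nt sequences differing at exactly 2 positions.
seq_a = "ACGT" * 25
seq_b = "T" + seq_a[1:50] + "A" + seq_a[51:]

d = jukes_cantor_distance(seq_a, seq_b)
print(round(d, 4))
```

For small p the corrected distance is only slightly larger than p itself, since few sites are expected to have changed more than once.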

The strain 69B4 exhibits the closest 16S rDNA relationship to members of 
Cellulomonas and Oerskovia of the family Cellulomonadaceae. The closest relatives are 
believed to be C. cellasea (DSM 20118) and C. fimi (DSM 20113), each having at least 95% 
sequence identity with the 16S rRNA gene nucleotide sequence of Cellulomonas strain 
69B4 (e.g., 96% and 95% identity, respectively). 

In some preferred embodiments of the present invention, the Cellulomonas spp. is 
Cellulomonas strain 69B4 (DSM16035). This strain was originally isolated from a sample of 
sediment and water from the littoral zone of Lake Bogoria, Kenya at Acacia Camp (Lat. 0° 
12'N, Long. 36° 07'E) collected on 10 October 1988. The water temperature was 33°C, pH 
10.5 with a conductivity of 44 mS/cm. Cellulomonas strain 69B4 was determined to have 
the phenotypic characteristics described below. Fresh cultures were Gram-positive, slender, 
generally straight, rod-shaped bacteria, approximately 0.5-0.7 µm x 1.8-4 µm. Older cultures 
contained mainly short rods and coccoid cells. Cells occasionally occurred in pairs or as V- 
forms, but primary branching was not observed. Endospores were not detected. On 
alkaline GAM agar the strain forms opaque, glistening, pale-yellow coloured, circular and 
convex or domed colonies, with entire margins, about 2 mm in diameter after 2-3 days 
incubation at 37°C. The colonies were viscous or slimy with a tendency to clump when 
scraped with a loop. On neutral Tryptone Soya Agar, strain growth was less vigorous, 
giving translucent yellow colonies, generally <1 mm in diameter. The cultures were 
facultatively anaerobic, as they were capable of growth under strictly anaerobic conditions. 
However, growth under anaerobic conditions was markedly reduced compared to aerobic 
growth. The strain also appeared to be negative in standard oxidase, urease, 
aminopeptidase, and KOH tests. In addition, nitrate was not reduced, although the 
organisms were catalase positive and DNase was produced under alkaline conditions. The 
preferred temperature range for growth was 20 - 37°C, with an optimum temperature at 
around 30-37°C. No growth was observed at 15°C or 45°C. 

The strain is alkalophilic and slightly halophilic. The strain may also be characterized 
as having growth occurring at pH values between 6.0 and 10.5 with an optimum around pH 
9-10. No growth was observed at pH 11 or pH 5.5. Growth below pH 7 was less vigorous 
and less abundant than that of cultures grown at the optimal temperature. The strain was 
observed to grow in medium containing 0-8% (w/v) NaCl. Furthermore, the strain may also 
be characterized as a chemo-organotroph, since it grew on complex substrates such as 
yeast extract and peptone; and hydrolyzed starch, gelatin, casein, carboxymethylcellulose 
and amorphous cellulose. 

The strain was observed to have metabolism that was respiratory and also 
fermentative. Acid was produced both aerobically and anaerobically from (API 50CH): 
L-arabinose, D-xylose, D-glucose, D-fructose, D-mannose, rhamnose (weak), cellobiose, 
maltose, sucrose, trehalose, gentiobiose, D-turanose, D-lyxose and 5-keto-gluconate 
(weak). Amygdalin, arbutin, salicin and esculin were also utilized. The strain was unable to 
utilize: ribose, lactose, galactose, melibiose, D-raffinose, glycogen, glycerol, erythritol, 
inositol, mannitol, sorbitol, xylitol, arabitol, gluconate and lactate. 

The strain was determined to be susceptible to ampicillin, chloramphenicol, 
erythromycin, fusidic acid, methicillin, novobiocin, streptomycin, tetracycline, sulphafurazole, 
oleandomycin, polymyxin, rifampicin, vancomycin and bacitracin; but resistant to gentamicin, 
nitrofurantoin, nalidixic acid, sulphamethoxazole, trimethoprim, penicillin G, neomycin and 
kanamycin. 

The following enzymes, aside from the protease of the present invention, were 
observed to be produced (ApiZym, API Coryne): C4-esterase, C8-esterase/lipase, leucine 
arylamidase, alpha-chymotrypsin, alpha-glucosidase, beta-glucosidase and pyrazinamidase. 

The strain was observed to exhibit the following chemotaxonomic characteristics. 
Major fatty acids (>10% of total) were C16:1 (28.1%), C18:0 (31.1%), and C18:1 (13.9%); 
n-saturated (79.1%), n-unsaturated (19.9%). Fatty acids with even numbers of carbons 
accounted for 98%. Main polar lipid components: phosphatidylglycerol (PG) and 3 
unidentified glycolipids (alpha-naphthol positive) were present; DPG, PGP, PI and PE were 
not detected. Menaquinones MK-4, MK-6, MK-7 and MK-9 were the main isoprenoids 
present. The cell wall peptidoglycan type was A4β with L-ornithine as diamino acid and 
D-aspartic acid in the interpeptide bridge. With regard to toxicity evaluation, there are no 
known toxicity or pathogenicity issues associated with bacteria of the genus Cellulomonas. 

Although there may be variations in the sequence of a naturally occurring enzyme 
within a given species of organism, enzymes of a specific type produced by organisms of 
the same species generally are substantially identical with respect to substrate specificity 
and/or proteolytic activity levels under given conditions (e.g., temperature, pH, water 
hardness, oxidative conditions, chelating conditions, and concentration), etc. Thus, for the 
purposes of the present invention, it is contemplated that other strains and species of 
Cellulomonas also produce the Cellulomonas protease of the present invention and thus 
provide useful sources for the proteases of the present invention. Indeed, as presented 
herein, it is contemplated that other members of the Micrococcineae will find use in the 
present invention. 

In some embodiments, the proteolytic polypeptides of this invention are 
characterized physicochemically, while in other embodiments, they are characterized based 
on their functionality, while in further embodiments, they are characterized using both sets of 
properties. Physicochemical characterization takes advantage of well known techniques 
such as SDS electrophoresis, gel filtration, amino acid composition, mass spectrometry 
(e.g., MALDI-TOF-MS, LC-ES-MS/MS, etc.), and sedimentation to determine the molecular 
weight of proteins, isoelectric focusing to determine the pI of proteins, amino acid 
sequencing to determine the amino acid sequences of proteins, crystallography studies to 
determine the tertiary structures of proteins, and antibody binding to determine antigenic 
epitopes present in proteins. 

In some embodiments, functional characteristics are determined by techniques well 
known to the practitioner in the protease field and include, but are not limited to, hydrolysis 
of various commercial substrates, such as di-methyl casein ("DMC") and/or AAPF-pNA. 
This preferred technique for functional characterization is described in greater detail in the 
Examples provided herein. 

In some embodiments of the present invention, the protease has a molecular weight 
of about 17 kD to about 21 kD, for example about 18 kD to 19 kD, for example 18700 daltons 
to 18800 daltons, for example about 18764 daltons (as determined by MALDI-TOF-MS). In 
another aspect of the present invention, the protease has the measured MALDI-TOF-MS 
spectrum set forth in Figure 3. 

The mature protease also displays proteolytic activity (e.g., hydrolytic activity on a 
substrate having peptide linkages, such as DMC). In further embodiments, proteases of the 
present invention provide enhanced wash performance under identified conditions. 
Although the present invention encompasses the protease 69B4 as described herein, in 
some embodiments, the proteases of the present invention exhibit at least 50%, 60%, 70%, 
75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic activity as compared 
to the proteolytic activity of 69B4. In some embodiments, the proteases display at least 
50%, 60%, 70%, 75%, 80%, 85%, 90%, 92%, 95%, 96%, 97%, 98% or 99% proteolytic 
activity as compared to the proteolytic activity of proteases sold under the tradenames 
SAVINASE® (Novozymes) or PURAFECT® (Genencor) under the same conditions. In some 
embodiments, the proteases of the present invention display comparative or enhanced wash 
performance under identified conditions as compared to 69B4 under the same conditions. 
In some preferred embodiments, the proteases of the present invention display comparative 
or enhanced wash performance under identified conditions, as compared to proteases sold 
under the tradenames SAVINASE® (Novozymes) or PURAFECT® (Genencor) under the 
same conditions. 

In yet further embodiments, the proteases and/or polynucleotides encoding the 
proteases of the present invention are provided in purified form (i.e., present in a particular 
composition in a higher or lower concentration than exists in a naturally occurring or 
wild-type organism), or in combination with components not normally present upon expression 
from a naturally occurring or wild-type organism. However, it is not intended that the 
present invention be limited to proteases of any specific purity level, as ranges of protease 
purity find use in various applications in which the proteases of the present invention are 
suitable. 

III. Obtaining Polynucleotides Encoding Micrococcineae 
(e.g., Cellulomonas) Proteases of the Present Invention 

In some embodiments, nucleic acid encoding a protease of the present invention is 
obtained by standard procedures known in the art from, for example, cloned DNA (e.g., a 
DNA "library"), chemical synthesis, cDNA cloning, PCR, cloning of genomic DNA or 
fragments thereof, or purified from a desired cell, such as a bacterial or fungal species (See, 
for example, Sambrook et al., supra [1989]; and Glover and Hames (eds.), DNA Cloning: A 
Practical Approach , Vols 1 and 2, Second Edition). Synthesis of polynucleotide sequences 
is well known in the art (See e.g., Beaucage and Caruthers, Tetrahedron Lett., 22:1859- 
1862 [1981]), including the use of automated synthesizers (See e.g., Needham- 
VanDevanter et al., Nucl. Acids Res., 12:6159-6168 [1984]). DNA sequences can also be 
custom made and ordered from a variety of commercial sources. As described in greater 
detail herein, in some embodiments, nucleic acid sequences derived from genomic DNA 
contain regulatory regions in addition to coding regions. 

In some embodiments involving the molecular cloning of the gene from genomic DNA, 
DNA fragments are generated, some of which comprise at least a portion of the desired gene. 
In some embodiments, the DNA is cleaved at specific sites using various restriction enzymes. 
In some alternative embodiments, DNAse is used in the presence of manganese to fragment 
the DNA, or the DNA is physically sheared (e.g., by sonication). The linear DNA fragments 
created are then separated according to size and amplified by standard techniques, 
including but not limited to, agarose and polyacrylamide gel electrophoresis, PCR and column 
chromatography. 
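
The fragmentation of linear DNA at specific restriction sites, and the resulting fragment sizes, can be modelled in a few lines. The sketch below considers the top strand only and uses the EcoRI recognition sequence GAATTC (cut after the first G) as an example; the function name and the input sequence are hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of a recognition site."""
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # cut position within the site
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical 20 nt sequence containing two EcoRI sites.
dna = "AAAGAATTCCCCGAATTCTT"
fragments = digest(dna)

# Fragments listed largest first, as they would be ordered on a sizing gel.
for fragment in sorted(fragments, key=len, reverse=True):
    print(len(fragment), fragment)
```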

Once nucleic acid fragments are generated, identification of the specific DNA 
fragment encoding a protease may be accomplished in a number of ways. For example, in 
some embodiments, the asp gene encoding a proteolytic enzyme, or its specific 
RNA, or a fragment thereof, such as a probe or primer, is isolated, labeled, and then used in 
hybridization assays well known to those in the art, to detect a generated gene (See e.g., 
Benton and Davis, Science 196:180 [1977]; and Grunstein and Hogness, Proc. Natl. Acad. 
Sci. USA 72:3961 [1975]). In preferred embodiments, DNA fragments sharing substantial 
sequence similarity to the probe hybridize under medium to high stringency. 

In some preferred embodiments, amplification is accomplished using PCR, as known 
in the art. In some preferred embodiments, a nucleic acid sequence of at least about 4 
nucleotides and as many as about 60 nucleotides from SEQ ID NOS:1, 2, 3 and/or 4 (i.e., 
a fragment), preferably about 12 to 30 nucleotides, and more preferably about 25 nucleotides, 
is used in any suitable combination as a PCR primer. These same fragments also find use 
as probes in hybridization and product detection methods. 
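
One way to picture the selection of primer-length fragments is to take the first nucleotides of a template as a forward primer and the reverse complement of its last nucleotides as a reverse primer. The sketch below is illustrative only: the function names are hypothetical, the template is not taken from SEQ ID NOS:1-4, and practical primer design would also weigh melting temperature and specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def primer_pair(template, length=25):
    """Forward and reverse primers flanking the whole template."""
    if not 4 <= length <= 60:
        raise ValueError("length outside the 4 to 60 nucleotide range")
    if len(template) < 2 * length:
        raise ValueError("template too short for non-overlapping primers")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Hypothetical 60 nt template.
template = "ATGC" * 15
forward, reverse = primer_pair(template, length=25)
print(forward)
print(reverse)
```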

In some embodiments, isolation of nucleic acid constructs of the invention from a 
cDNA or genomic library utilizes PCR using degenerate oligonucleotide primers 
prepared on the basis of the amino acid sequence of the protein having the amino acid 
sequence as shown in SEQ ID NOS:1-5. The primers can be of any segment length, for 
example at least 4, at least 5, at least 8, at least 15, or at least 20 nucleotides in length. 
Exemplary probes in the present application utilized a primer comprising a TTGWHCGT and 
a GDSGG polynucleotide sequence, as more fully described in the Examples. 
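
To illustrate why primers prepared from an amino acid sequence are degenerate, the sketch below back-translates a short motif with the standard genetic code. It assumes, for the purpose of the example only, that TTGWHCGT and GDSGG are read as amino acid motifs in one-letter code; the function names are hypothetical, and the primers actually used are those set out in the Examples.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = {}
for codon, amino_acid in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(amino_acid, []).append("".join(codon))

# IUPAC ambiguity codes, keyed by the sorted set of bases they represent.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "AG": "R", "CT": "Y",
         "CG": "S", "AT": "W", "GT": "K", "AC": "M", "CGT": "B",
         "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N"}


def coding_sequences(motif):
    """Number of distinct nucleotide sequences that encode the motif."""
    return prod(len(CODONS[amino_acid]) for amino_acid in motif)


def degenerate_oligo(motif):
    """Position-by-position IUPAC consensus over all codons of each residue.

    For serine (TCN and AGY) the consensus WSN also covers non-serine
    codons, so the oligo may be more degenerate than coding_sequences().
    """
    oligo = []
    for amino_acid in motif:
        for position in range(3):
            bases = {codon[position] for codon in CODONS[amino_acid]}
            oligo.append(IUPAC["".join(sorted(bases))])
    return "".join(oligo)


print(degenerate_oligo("GDSGG"), coding_sequences("GDSGG"))
print(degenerate_oligo("TTGWHCGT"), coding_sequences("TTGWHCGT"))
```

In practice the degeneracy is often reduced by taking the codon preference of the source organism into account.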

In view of the above, it will be appreciated that the polynucleotide sequences 
provided herein and based on the polynucleotide sequences provided in SEQ ID NOS:1-5 
are useful for obtaining identical or homologous fragments of polynucleotides from other 
species, and particularly from bacteria that encode enzymes having the serine protease 
activity expressed by protease 69B4. 

IV. Expression and Recovery of Serine Proteases of the Present Invention 

Any suitable means for expression and recovery of the serine proteases of the 
present invention finds use herein. Indeed, those of skill in the art know many methods 
suitable for cloning a Cellulomonas-derived polypeptide having proteolytic activity, as well as 
an additional enzyme (e.g., a second peptide having proteolytic activity, such as a protease, 
cellulase, mannanase, or amylase, etc.). Numerous methods are also known in the art for 
introducing at least one copy (e.g., multiple copies) of the polynucleotide(s) encoding the 
enzyme(s) of the present invention in conjunction with any additional sequences desired, 
into the genes or genome of host cells. 

In general, standard procedures for cloning of genes and introducing exogenous 
protease-encoding regions (including multiple copies of the exogenous encoding regions) 
into said genes find use in obtaining a Cellulomonas 69B4 protease derivative or homologue 
thereof. Indeed, the present Specification, including the Examples, provides such teaching. 
However, additional methods known in the art are also suitable (See e.g., Sambrook et al., 
supra [1989]; Ausubel et al., supra [1995]; Harwood and Cutting (eds.), Molecular 
Biological Methods for Bacillus, John Wiley and Sons [1990]; and WO 96/34946). 

In some preferred embodiments, the polynucleotide sequences of the present 
invention are expressed by operatively linking them to an expression control sequence in an 
appropriate expression vector and employed by that expression vector to transform an 
appropriate host according to techniques well established in the art. In some embodiments, 
the polypeptides produced on expression of the DNA sequences of this invention are 
isolated from the fermentation of cell cultures and purified in a variety of ways according to 
well established techniques in the art. Those of skill in the art are capable of selecting the 
most appropriate isolation and purification techniques. 

More particularly, the present invention provides constructs, vectors comprising 
polynucleotides described herein, host cells transformed with such vectors, proteases 
expressed by such host cells, expression methods and systems for the production of serine 
protease enzymes derived from microorganisms, in particular, members of the 
Micrococcineae, including but not limited to Cellulomonas species. In some embodiments, 
the polynucleotide(s) encoding serine protease(s) are used to produce recombinant host 
cells suitable for the expression of the serine protease(s). In some preferred embodiments, 
the expression hosts are capable of producing the protease(s) in commercially viable 
quantities. 

IV. Recombinant Vectors 

As indicated above, in some embodiments, the present invention provides vectors 
comprising the aforementioned polynucleotides. In some embodiments, the vectors (i.e., 
constructs) of the invention encoding the protease are of genomic origin (e.g., prepared 
through use of a genomic library and screening for DNA sequences coding for all or part of 
the protease by hybridization using synthetic oligonucleotide probes in accordance with 
standard techniques). In some preferred embodiments, the DNA sequence encoding the 
protease is obtained by isolating chromosomal DNA from the Cellulomonas strain 69B4 and 
amplifying the sequence by PCR methodology (See, the Examples). 

In alternative embodiments, the nucleic acid construct of the invention encoding the 
protease is prepared synthetically by established standard methods (See e.g., Beaucage 
and Caruthers, Tetrahedron Lett., 22:1859-1862 [1981]; and Matthes et al., EMBO J., 3:801-805 
[1984]). According to the phosphoramidite method, oligonucleotides are synthesized (e.g., 
in an automatic DNA synthesizer), purified, annealed, ligated and cloned in suitable vectors. 

In additional embodiments, the nucleic acid construct is of mixed synthetic and 
genomic origin. In some embodiments, the construct is prepared by ligating fragments of 
synthetic or genomic DNA (as appropriate), wherein the fragments correspond to various 
parts of the entire nucleic acid construct, in accordance with standard techniques. 

In further embodiments, the present invention provides vectors comprising at least 
one DNA construct of the present invention. In some embodiments, the present invention 
encompasses recombinant vectors. It is contemplated that any suitable vector will find use 
in the present invention, including autonomously replicating vectors as well as vectors that 
integrate (either transiently or stably) within the host cell genome. Indeed, a wide variety of 
vectors, and expression cassettes suitable for the cloning, transformation and expression in 
fungal (mold and yeast), bacterial, insect and plant cells are known to those of skill in the 
art. Typically, the vector or cassette contains sequences directing transcription and 
translation of the nucleic acid, a selectable marker, and sequences allowing autonomous 
replication or chromosomal integration. In some embodiments, suitable vectors comprise a 
region 5' of the gene which harbors transcriptional initiation controls and a region 3' of the 
DNA fragment which controls transcriptional termination. These control regions may be 
derived from genes homologous or heterologous to the host as long as the control region 
selected is able to function in the host cell. 

The vector is preferably an expression vector in which the DNA sequence encoding 
the protease of the invention is operably linked to additional segments required for 
transcription of the DNA. In some preferred embodiments, the expression vector is derived 
from plasmid or viral DNA, or in alternative embodiments, contains elements of both. 
Exemplary vectors include, but are not limited to pSEGCT, pSEACT, and/or pSEA4CT, as 
well as all of the vectors described in the Examples herein. Construction of such vectors is 
described herein, and methods are well known in the art (See e.g., U.S. Pat. No. 6,287,839; 
and WO 02/50245). In some preferred embodiments, the vector pSEGCT (about 8302 bp; 
See, Figure 5) finds use in the construction of a vector comprising the polynucleotides 
described herein (e.g., pSEG69B4T; See, Figure 6). In alternative preferred embodiments, 
the vector pSEA469B4CT (See, Figure 7) finds use in the construction of a vector 
comprising the polynucleotides described herein. Indeed, it is intended that all of the 
vectors described herein will find use in the present invention. 

In some embodiments, the additional segments required for transcription include 
regulatory segments (e.g., promoters, secretory segments, inhibitors, global regulators, 
etc.), as known in the art. One example includes any DNA sequence that shows 
transcriptional activity in the host cell of choice and is derived from genes encoding proteins 
either homologous or heterologous to the host cell. Specifically, examples of suitable 
promoters for use in bacterial host cells include but are not limited to the promoter of the 
Bacillus stearothermophilus maltogenic amylase gene, the Bacillus amyloliquefaciens (BAN) 
amylase gene, the Bacillus subtilis alkaline protease gene, the Bacillus clausii alkaline 
protease gene, the Bacillus pumilus xylosidase gene, the Bacillus thuringiensis cryIIIA, and 
the Bacillus licheniformis alpha-amylase gene. Additional promoters include the A4 
promoter, as described herein. Other promoters that find use in the present invention 
include, but are not limited to phage Lambda P_R or P_L promoters, as well as the E. coli lac, 
trp or tac promoters. 

In some embodiments, the promoter is derived from a gene encoding said protease 
or a fragment thereof having substantially the same promoter activity as said sequence. 
The invention further encompasses nucleic acid sequences which hybridize to the promoter 
sequences under intermediate, high, and/or maximum stringency conditions, or which have 
at least about 90% homology and preferably about 95% homology to such promoter, but 
which have substantially the same promoter activity. In some embodiments, this promoter is 
used to promote the expression of either the protease and/or a heterologous DNA sequence 
(e.g., another enzyme in addition to the protease of the present invention). In additional 
embodiments, the vector also comprises at least one selectable marker. 

In some embodiments, the recombinant vectors of the invention further comprise a 
DNA sequence enabling the vector to replicate in the host cell. In some preferred 
embodiments involving bacterial host cells, these sequences comprise all the sequences 
needed to allow plasmid replication (e.g., ori and/or rep sequences). 

In some particularly preferred embodiments, signal sequences (e.g., leader 
sequence or pre sequence) are also included in the vector, in order to direct a polypeptide of 
the present invention into the secretory pathway of the host cells. In some more preferred 
embodiments, a secretory signal sequence is joined to the DNA sequence encoding the 
precursor protease in the correct reading frame (See e.g., SEQ ID NOS:1 and 2). 
Depending on whether the protease is to be expressed intracellularly or is secreted, a 
polynucleotide sequence or expression vector of the invention is engineered with or without 
a natural polypeptide signal sequence or a signal sequence which functions in bacteria (e.g., 
Bacillus sp.), fungi (e.g., Trichoderma), other prokaryotes or eukaryotes. In some 
embodiments, expression is achieved by either removing or partially removing the signal 
sequence. 
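
Whether a signal sequence has been joined to a coding sequence in the correct reading frame can be checked computationally before a construct is made. The following sketch assumes that both segments are supplied as coding-strand DNA and tests only codon phase and the absence of internal stop codons; the function name and the sequences are hypothetical and are not taken from SEQ ID NOS:1 and 2.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_is_in_frame(signal_dna, protease_dna):
    """True if signal + protease DNA reads as one uninterrupted frame."""
    signal, protease = signal_dna.upper(), protease_dna.upper()
    # Each segment must be a whole number of codons to preserve the frame.
    if len(signal) % 3 or len(protease) % 3:
        return False
    fused = signal + protease
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # A stop codon is tolerated only as the final codon of the fusion.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Hypothetical segments: a 3-codon signal and a 6-codon coding sequence.
protease_dna = "GGTGATTCTGGTGGTTAA"
print(fusion_is_in_frame("ATGAAAGCA", protease_dna))  # whole codons, no stop
print(fusion_is_in_frame("ATGAAAGC", protease_dna))   # one base short
```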

In some embodiments involving secretion from bacterial cells, the signal peptide is a 
naturally occurring signal peptide, or a functional part thereof, while in other embodiments, it 
is a synthetic peptide. Suitable signal peptides include but are not limited to sequences 
derived from Bacillus licheniformis alpha-amylase, Bacillus clausii alkaline protease, and 
Bacillus amyloliquefaciens amylase. One preferred signal sequence is the signal peptide 
derived from Cellulomonas strain 69B4, as described herein. Thus, in some particularly 
preferred embodiments, the signal peptide comprises the signal peptide from the protease 
described herein. This signal finds use in facilitating the secretion of the 69B4 protease 
and/or a heterologous DNA sequence (e.g. a second protease, such as another wild-type 
protease, a BPISP variant protease, a GG36 variant protease, a lipase, a cellulase, a 
mannanase, etc.). In some embodiments, these second enzymes are encoded by the DNA 
sequence and/or the amino acid sequences known in the art (See e.g., U.S. Pat. Nos. 
6,465,235, 6,287,839, 5,965,384, and 5,795,764; as well as WO 98/22500, WO 92/05249, 
EP 0305216 B1, and WO 94/25576). Furthermore, it is contemplated that in some 
embodiments, the signal sequence peptide is also operatively linked to an endogenous 
sequence to activate and secrete such endogenous encoded protease. 

The procedures used to ligate the DNA sequences coding for the present protease, 
the promoter and/or secretory signal sequence, respectively, and to insert them into suitable 
vectors containing the information necessary for replication, are well known to those skilled 
in the art. As indicated above, in some embodiments, the nucleic acid construct is prepared 
using PCR with specific primers. 

V. Host Cells 

As indicated above, in some embodiments, the present invention also provides host 
cells transformed with the vectors described above. In some embodiments, the 
polynucleotide encoding the protease(s) of the present invention that is introduced into the 
host cell is homologous, while in other embodiments, the polynucleotide is heterologous to 
the host. In some embodiments in which the polynucleotide is homologous to the host cell 
(e.g., additional copies of the native protease produced by the host cell are introduced), it is 
operably connected to another homologous or heterologous promoter sequence. In 
alternative embodiments, another secretory signal sequence, and/or terminator sequence 
find use in the present invention. Thus, in some embodiments, the polypeptide DNA 
sequence comprises multiple copies of a homologous polypeptide sequence, a 
heterologous polypeptide sequence from another organism, or synthetic polypeptide 
sequence(s). Indeed, it is not intended that the present invention be limited to any particular 
host cells and/or vectors. 

Indeed, the host cell into which the DNA construct of the present invention is 
introduced may be any cell which is capable of producing the present alkaline protease, 
including, but not limited to bacteria, fungi, and higher eukaryotic cells. 

Examples of bacterial host cells which find use in the present invention include, but 
are not limited to Gram-positive bacteria such as Bacillus, Streptomyces, and Thermobifida, 
for example strains of B. subtilis, B. licheniformis, B. lentus, B. brevis, B. 
stearothermophilus, B. clausii, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus, B. 
megaterium, B. thuringiensis, S. griseus, S. lividans, S. coelicolor, S. avermitilis and T. 
fusca; as well as Gram-negative bacteria such as members of the Enterobacteriaceae (e.g., 
Escherichia coli). In some particularly preferred embodiments, the host cells are B. subtilis, 
B. clausii, and/or B. licheniformis. In additional preferred embodiments, the host cells are 
strains of S. lividans (e.g., TK23 and/or TK21). Any suitable method for transformation of 
the bacteria finds use in the present invention, including but not limited to protoplast 
transformation, use of competent cells, etc., as known in the art. In some preferred 
embodiments, the method provided in U.S. Pat. No. 5,264,366 (incorporated by reference 
herein) finds use in the present invention. For S. lividans, one preferred means for 
transformation and protein expression is that described by Fernandez-Abalos et al. (See, 
Fernandez-Abalos et al., Microbiol., 149:1623-1632 [2003]; See also, Hopwood, et al., 
Genetic Manipulation of Streptomyces: Laboratory Manual, Innis [1985], both of which are 
incorporated by reference herein). Of course, the methods described in the Example herein 
find use in the present invention. 

Examples of fungal host cells which find use in the present invention include, but are 
not limited to Trichoderma spp. and Aspergillus spp. In some particularly preferred 
embodiments, the host cells are Trichoderma reesei and/or Aspergillus niger. In some 
embodiments, transformation and expression in Aspergillus is performed as described in 
U.S. Pat. No. 5,364,770, herein incorporated by reference. Of course, the methods described in 
the Example herein find use in the present invention. 

In some embodiments, particular promoter and signal sequences are needed to 
provide effective transformation and expression of the protease(s) of the present invention. 
Thus, in some preferred embodiments involving the use of Bacillus host cells, the aprE 
promoter is used in combination with known Bacillus-derived signal and other regulatory 
sequences. In some preferred embodiments involving expression in Aspergillus, the glaA 
promoter is used. In some embodiments involving Streptomyces host cells, the glucose 
isomerase (GI) promoter of Actinoplanes missouriensis is used, while in other embodiments, 
the A4 promoter is used. 

In some embodiments involving expression in bacteria such as E. coli, the protease 
is retained in the cytoplasm, typically as insoluble granules (i.e., inclusion bodies). 
However, in other embodiments, the protease is directed to the periplasmic space by a 
bacterial secretion sequence. In the former case, the cells are lysed, and the granules are 
recovered and denatured, after which the protease is refolded by diluting the denaturing 
agent. In the latter case, the protease is recovered from the periplasmic space by disrupting 
the cells (e.g., by sonication or osmotic shock) to release the contents of the periplasmic 
space, and then recovering the protease. 

In preferred embodiments, the transformed host cells of the present invention are 
cultured in a suitable nutrient medium under conditions permitting the expression of the 
present protease, after which the resulting protease is recovered from the culture. The 
medium used to culture the cells comprises any conventional medium suitable for growing 
the host cells, such as minimal or complex media containing appropriate supplements. 
Suitable media are available from commercial suppliers or may be prepared according to 
published recipes (e.g., in catalogues of the American Type Culture Collection). In some 
embodiments, the protease produced by the cells is recovered from the culture medium by 
conventional procedures, including, but not limited to separating the host cells from the 
medium by centrifugation or filtration, precipitating the proteinaceous components of the 
supernatant or filtrate by means of a salt (e.g., ammonium sulfate), chromatographic 
purification (e.g., ion exchange, gel filtration, affinity, etc.). Thus, any method suitable for 
recovering the protease(s) of the present invention will find use. Indeed, it is not intended 
that the present invention be limited to any particular purification method. 

VI. Applications for Serine Protease Enzymes 

As described in greater detail herein, the proteases of the present invention have 
important characteristics that make them very suitable for certain applications. For example, 
the proteases of the present invention have enhanced thermal stability, enhanced oxidative 
stability, and enhanced chelator stability, as compared to some currently used proteases. 

Thus, these proteases find use in cleaning compositions. Indeed, under certain 
wash conditions, the present proteases exhibit comparable or enhanced wash performance 
as compared with currently used subtilisin proteases. Thus, it is contemplated that the 
cleaning and/or enzyme compositions of the present invention will be provided in a variety of 
cleaning compositions. In some embodiments, the proteases of the present invention are 
utilized in the same manner as subtilisin proteases (i.e., proteases currently in use). Thus, 
the present proteases find use in various cleaning compositions, as well as animal feed 
applications, leather processing (e.g., bating), protein hydrolysis, and in textile uses. The 
identified proteases also find use in personal care applications. 

Thus, the proteases of the present invention find use in a number of industrial 
applications, in particular within the cleaning, disinfecting, animal feed, and textile/leather 
industries. In some embodiments, the protease(s) of the present invention are combined 
with detergents, builders, bleaching agents and other conventional ingredients to produce a 
variety of novel cleaning compositions useful in the laundry and other cleaning arts such as, 
for example, laundry detergents (both powdered and liquid), laundry pre-soaks, all fabric 
bleaches, automatic dishwashing detergents (both liquid and powdered), household 
cleaners, particularly bar and liquid soap applications, and drain openers. In addition, the 
proteases find use in the cleaning of contact lenses, as well as other items, by contacting 
such materials with an aqueous solution of the cleaning composition. In addition, these 
naturally occurring proteases can be used, for example, in peptide hydrolysis, waste 
treatment, textile applications, medical device cleaning, biofilm removal, and as 
fusion-cleavage enzymes in protein production, etc. The composition of these products is not 
critical to the present invention, as long as the protease(s) maintain their function in the 
setting used. In some embodiments, the compositions are readily prepared by combining a 
cleaning effective amount of the protease or an enzyme composition comprising the 
protease enzyme preparation with the conventional components of such compositions in 
their art recognized amounts. 

A. Cleaning Compositions 

The cleaning composition of the present invention may be advantageously employed 
for example, in laundry applications, hard surface cleaning, automatic dishwashing 
applications, as well as cosmetic applications such as dentures, teeth, hair and skin. 
However, due to the unique advantages of increased effectiveness in lower temperature 
solutions and the superior color-safety profile, the enzymes of the present invention are 
ideally suited for laundry applications such as the bleaching of fabrics. Furthermore, the 
enzymes of the present invention may be employed in both granular and liquid 
compositions. 

The enzymes of the present invention may also be employed in a cleaning additive 
product. A cleaning additive product including the enzymes of the present invention is 
ideally suited for inclusion in a wash process when additional bleaching effectiveness is 
desired. Such instances may include, but are not limited to, low temperature solution 
cleaning applications. The additive product may be, in its simplest form, one or more 
proteases, including ASP. Such additive may be packaged in dosage form for addition to a 
cleaning process where a source of peroxygen is employed and increased bleaching 
effectiveness is desired. Such single dosage form may comprise a pill, tablet, gelcap or 
other single dosage unit such as pre-measured powders or liquids. A filler or carrier 
material may be included to increase the volume of such composition. Suitable filler or 
carrier materials include, but are not limited to, various salts of sulfate, carbonate and 
silicate as well as talc, clay and the like. Filler or carrier materials for liquid compositions 
may be water or low molecular weight primary and secondary alcohols including polyols and 
diols. Examples of such alcohols include, but are not limited to, methanol, ethanol, propanol 
and isopropanol. The compositions may contain from about 5% to about 90% of such 
materials. Acidic fillers can be used to reduce pH. Alternatively, the cleaning additive may 
include an activated peroxygen source defined below or the adjunct ingredients as fully defined 
below. 

The present cleaning compositions and cleaning additives require an effective 
amount of the ASP enzyme and/or variants provided herein. The required level of enzyme 
may be achieved by the addition of one or more species of the enzymes of the present 
invention. Typically the present cleaning compositions will comprise at least 0.0001 weight 
percent, from about 0.0001 to about 1, from about 0.001 to about 0.5, or even from about 
0.01 to about 0.1 weight percent of at least one of the enzymes of the present invention. 

The cleaning compositions herein will typically be formulated such that, during use in 
aqueous cleaning operations, the wash water will have a pH of from about 5.0 to about 11.5 
or even from about 7.5 to about 10.5. Liquid product formulations are typically formulated to 
have a neat pH from about 3.0 to about 9.0 or even from about 3 to about 5. Granular 
laundry products are typically formulated to have a pH from about 9 to about 11. 
Techniques for controlling pH at recommended usage levels include the use of buffers, 
alkalis, acids, etc., and are well known to those skilled in the art. 

Suitable low pH cleaning compositions typically have a neat pH of from about 3 to 
about 5, and are typically free of surfactants that hydrolyze in such a pH environment. Such 
surfactants include sodium alkyl sulfate surfactants that comprise at least one ethylene 
oxide moiety or even from about 1 to 16 moles of ethylene oxide. Such cleaning 
compositions typically comprise a sufficient amount of a pH modifier, such as sodium 
hydroxide, monoethanolamine or hydrochloric acid, to provide such cleaning composition 
with a neat pH of from about 3 to about 5. Such compositions typically comprise at least one 
acid stable enzyme. Said compositions may be liquids or solids. The pH of such liquid 
compositions is measured as a neat pH. The pH of such solid compositions is measured as 
a 10% solids solution of said composition wherein the solvent is distilled water. In these 
embodiments, all pH measurements are taken at 20°C. 

When the serine protease(s) is/are employed in a granular composition or liquid, it 
may be desirable for the enzyme to be in the form of an encapsulated particle to protect 
such enzyme from other components of the granular composition during storage. In 
addition, encapsulation is also a means of controlling the availability of the enzyme during 
the cleaning process and may enhance performance of the enzymes provided herein. In 
this regard, the serine proteases of the present invention may be encapsulated with any 
encapsulating material known in the art. 

The encapsulating material typically encapsulates at least part of the catalyst for the 
enzymes of the present invention. Typically, the encapsulating material is water-soluble 
and/or water-dispersible. The encapsulating material may have a glass transition 
temperature (Tg) of 0°C or higher. Glass transition temperature is described in more detail 
in WO 97/11151, especially from page 6, line 25 to page 7, line 2. 

The encapsulating material may be selected from the group consisting of 
carbohydrates, natural or synthetic gums, chitin and chitosan, cellulose and cellulose 
derivatives, silicates, phosphates, borates, polyvinyl alcohol, polyethylene glycol, paraffin 
waxes and combinations thereof. When the encapsulating material is a carbohydrate, it 
is typically selected from the group consisting of monosaccharides, oligosaccharides, 
polysaccharides, and combinations thereof. Typically, the encapsulating material is a 
starch. Suitable starches are described in EP 0 922 499; US 4,977,252; US 5,354,559 and 
US 5,935,826. 

The encapsulating material may be a microsphere made from plastic such as 
thermoplastics, acrylonitrile, methacrylonitrile, polyacrylonitrile, polymethacrylonitrile and 
mixtures thereof; commercially available microspheres that can be used are those supplied 
by Expancel of Stockviksverken, Sweden under the trademark Expancel®, and those 
supplied by PQ Corp. of Valley Forge, Pennsylvania U.S.A. under the tradename PM 6545, 
PM 6550, PM 7220, PM 7228, Extendospheres®, Luxsil®, Q-cel® and Sphericel®. 

As described herein, the proteases of the present invention find particular use in the 
cleaning industry, including, but not limited to laundry and dish detergents. These 
applications place enzymes under various environmental stresses. The proteases of the 
present invention provide advantages over many currently used enzymes, due to their 
stability under various conditions. 

Indeed, there are a variety of wash conditions including varying detergent 
formulations, wash water volumes, wash water temperatures, and lengths of wash time, to 
which proteases involved in washing are exposed. In addition, detergent formulations used 
in different geographical areas have different concentrations of their relevant components 
present in the wash water. For example, a European detergent typically has about 4500- 
5000 ppm of detergent components in the wash water, while a Japanese detergent typically 
has approximately 667 ppm of detergent components in the wash water. In North America, 
particularly the United States, detergents typically have about 975 ppm of detergent 
components present in the wash water. 

A low detergent concentration system includes detergents where less than about 800 
ppm of detergent components are present in the wash water. Japanese detergents are 
typically considered low detergent concentration systems as they have approximately 667 
ppm of detergent components present in the wash water. 

A medium detergent concentration system includes detergents where between about 800 
ppm and about 2000 ppm of detergent components are present in the wash water. North 
American detergents are generally considered to be medium detergent concentration 
systems as they have approximately 975 ppm of detergent components present in the wash 
water. Brazil typically has approximately 1500 ppm of detergent components present in the 
wash water. 

A high detergent concentration system includes detergents where greater than about 
2000 ppm of detergent components are present in the wash water. European detergents 
are generally considered to be high detergent concentration systems as they have 
approximately 4500-5000 ppm of detergent components in the wash water. 

Latin American detergents are generally high suds phosphate builder detergents and 
the range of detergents used in Latin America can fall in both the medium and high 
detergent concentrations as they range from 1500 ppm to 6000 ppm of detergent 
components in the wash water. As mentioned above, Brazil typically has approximately 1500 
ppm of detergent components present in the wash water. However, other high suds 
phosphate builder detergent geographies, not limited to other Latin American countries, may 
have high detergent concentration systems up to about 6000 ppm of detergent components 
present in the wash water. 

In light of the foregoing, it is evident that concentrations of detergent compositions in 
typical wash solutions throughout the world vary from less than about 800 ppm of 
detergent composition ("low detergent concentration geographies"), for example about 667 
ppm in Japan, to between about 800 ppm and about 2000 ppm ("medium detergent 
concentration geographies"), for example about 975 ppm in the U.S. and about 1500 ppm in 
Brazil, to greater than about 2000 ppm ("high detergent concentration geographies"), for 
example about 4500 ppm to about 5000 ppm in Europe and about 6000 ppm in high suds 
phosphate builder geographies. 
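
The three concentration bands described above can be illustrated with a short lookup. This is a minimal sketch and not part of the original disclosure: the function name `classify_detergent_concentration` and the dictionary `TYPICAL_PPM` are hypothetical, and the handling of the boundary values (exactly 800 ppm and exactly 2000 ppm are placed in the medium band) is an assumption, since the text gives only approximate thresholds.

```python
def classify_detergent_concentration(ppm: float) -> str:
    """Return the concentration band for a wash-water detergent level in ppm.

    Thresholds follow the text: below about 800 ppm is low, about 800 to
    about 2000 ppm is medium, above about 2000 ppm is high. Assigning the
    exact boundary values to "medium" is an assumption of this sketch.
    """
    if ppm < 0:
        raise ValueError("concentration cannot be negative")
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Typical values quoted in the text (hypothetical container for illustration).
TYPICAL_PPM = {
    "Japan": 667,
    "United States": 975,
    "Brazil": 1500,
    "Europe": 4500,
    "High suds phosphate builder geographies": 6000,
}

for region, ppm in TYPICAL_PPM.items():
    print(f"{region}: {ppm} ppm -> {classify_detergent_concentration(ppm)}")
```

Under these assumptions, the Japanese value falls in the low band, the U.S. and Brazilian values in the medium band, and the European and high suds phosphate builder values in the high band, in agreement with the groupings given above.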

The concentrations of the typical wash solutions are determined empirically. For 
example, in the U.S., a typical washing machine holds a volume of about 64.4 L of wash 
solution. Accordingly, in order to obtain a concentration of about 975 ppm of detergent 
within the wash solution about 62.79 g of detergent composition must be added to the 64.4 
L of wash solution. This amount is the typical amount measured into the wash water by the 
consumer using the measuring cup provided with the detergent. 
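
The dosing arithmetic in the preceding paragraph can be checked with a few lines of code. The following is a sketch only; it assumes that ppm is treated as milligrams of detergent per liter of wash solution (i.e., a wash solution density of roughly 1 kg/L), which the text does not state explicitly, and the function names are hypothetical.

```python
def detergent_dose_grams(target_ppm: float, wash_volume_liters: float) -> float:
    """Grams of detergent needed to reach target_ppm in the given wash volume.

    Assumption: 1 ppm is taken as 1 mg of detergent per liter of wash solution.
    """
    return target_ppm * wash_volume_liters / 1000.0


def concentration_ppm(dose_grams: float, wash_volume_liters: float) -> float:
    """Inverse calculation: ppm obtained from a dose in grams (same assumption)."""
    return dose_grams * 1000.0 / wash_volume_liters


# U.S. example from the text: about 975 ppm in about 64.4 L of wash solution.
dose = detergent_dose_grams(975, 64.4)
print(f"dose: {dose:.2f} g")
print(f"check: {concentration_ppm(dose, 64.4):.0f} ppm")
```

With the U.S. figures, 975 mg/L multiplied by 64.4 L gives 62,790 mg, i.e., about 62.79 g, which matches the amount stated above.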

As a further example, different geographies use different wash temperatures. The 
temperature of the wash water in Japan is typically less than that used in Europe. For 
example, the temperature of the wash water in North America and Japan can be between 
10 and 30°C (e.g., about 20°C), whereas the temperature of wash water in Europe is 
typically between 30 and 60°C (e.g., about 40°C). 

As a further example, different geographies typically have different water hardness. 
Water hardness is usually described in terms of grains per gallon of mixed Ca2+/Mg2+. 
Hardness is a measure of the amount of calcium (Ca2+) and magnesium (Mg2+) in the water. 
Most water in the United States is hard, but the degree of hardness varies. Moderately hard 
(60-120 ppm) to hard (121-181 ppm) water has 60 to 181 parts per million of hardness 
minerals (to convert parts per million to grains per U.S. gallon, divide the ppm value by 
17.1). 



Water              Grains per gallon      Parts per million 
Soft               less than 1.0          less than 17 
Slightly hard      1.0 to 3.5             17 to 60 
Moderately hard    3.5 to 7.0             60 to 120 
Hard               7.0 to 10.5            120 to 180 
Very hard          greater than 10.5      greater than 180 



European water hardness is typically greater than 10.5 (for example 10.5-20.0) 
grains per gallon mixed Ca2+/Mg2+ (e.g., about 15 grains per gallon mixed Ca2+/Mg2+). 
North American water hardness is typically greater than Japanese water hardness, but less 
than European water hardness. For example, North American water hardness can be 
between 3 to 10 grains, 3-8 grains or about 6 grains. Japanese water hardness is typically 
lower than North American water hardness, usually less than 4, for example 3 grains per 
gallon mixed Ca2+/Mg2+. 
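
The hardness conversion and the hardness classes listed above can be sketched as follows. This is an illustration under stated assumptions, not part of the original disclosure: the function names are hypothetical, and because adjacent classes share their boundary values (e.g., 3.5 grains per gallon ends one class and begins the next), the assignment of each shared boundary to the harder class, except 10.5 which is kept in "hard", is an assumption.

```python
PPM_PER_GRAIN_PER_GALLON = 17.1  # conversion factor given in the text


def ppm_to_grains_per_gallon(ppm: float) -> float:
    """Convert hardness in parts per million to grains per U.S. gallon."""
    return ppm / PPM_PER_GRAIN_PER_GALLON


def grains_per_gallon_to_ppm(gpg: float) -> float:
    """Convert hardness in grains per U.S. gallon to parts per million."""
    return gpg * PPM_PER_GRAIN_PER_GALLON


def classify_hardness(gpg: float) -> str:
    """Map grains per gallon onto the hardness classes listed above.

    Boundary handling is an assumption of this sketch.
    """
    if gpg < 1.0:
        return "soft"
    if gpg < 3.5:
        return "slightly hard"
    if gpg < 7.0:
        return "moderately hard"
    if gpg <= 10.5:
        return "hard"
    return "very hard"


# Example hardness values quoted in the text, in grains per gallon.
for region, gpg in {"Europe": 15, "North America": 6, "Japan": 3}.items():
    ppm = grains_per_gallon_to_ppm(gpg)
    print(f"{region}: {gpg} gpg (~{ppm:.0f} ppm) -> {classify_hardness(gpg)}")
```

On these assumptions, the European example (about 15 grains per gallon) is classed as very hard, the North American example (about 6 grains) as moderately hard, and the Japanese example (3 grains) as slightly hard.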

Accordingly, in some embodiments, the present invention provides proteases that 
show surprising wash performance in at least one set of wash conditions (e.g., water 
temperature, water hardness, and/or detergent concentration). In some embodiments, the 
proteases of the present invention are comparable in wash performance to subtilisin 
proteases. In some embodiments, the proteases of the present invention exhibit enhanced 
wash performance as compared to subtilisin proteases. Thus, in some preferred 
embodiments of the present invention, the proteases provided herein exhibit enhanced 
oxidative stability, enhanced thermal stability, and/or enhanced chelator stability. 

In some preferred embodiments, the present invention provides the ASP protease, 
as well as homologues and variants of the protease. These proteases find use in any 
applications in which it is desired to clean protein based stains from textiles or fabrics. 

In some embodiments, the cleaning compositions of the present invention are 
formulated as hand and machine laundry detergent compositions including laundry additive 
compositions, and compositions suitable for use in the pretreatment of stained fabrics, rinse 
added fabric softener compositions, and compositions for use in general household hard 
surface cleaning operations, as well as dishwashing operations. Those in the art are 
familiar with different formulations which can be used as cleaning compositions. In 
preferred embodiments, the proteases of the present invention exhibit comparable or 
enhanced performance in detergent compositions (i.e., as compared to other proteases). In 
some embodiments, cleaning performance is evaluated by comparing the proteases of the 
present invention with subtilisin proteases in various cleaning assays that utilize enzyme- 
sensitive stains such as egg, grass, blood, milk, etc., in standard methods. Indeed, those in 
the art are familiar with the spectrophotometric and other analytical methodologies used to 
assess detergent performance under standard wash cycle conditions. 

Assays that find use in the present invention include, but are not limited to those 
described in WO 99/34011, and U.S. Pat. No. 6,605,458 (See e.g., Example 3). In U.S. 
Pat. No. 6,605,458, at Example 3, a detergent dose of 3.0 g/l at pH 10.5, wash time 15 
minutes, at 15°C, water hardness of 6°dH, 10 nM enzyme concentration in 150 ml glass 
beakers with stirring rod, 5 textile pieces (phi 2.5 cm) in 50 ml, EMPA 117 test material from 
Center for Test Materials Holland are used. The measurement of reflectance "R" on the test 
material was done at 460 nm using a Macbeth ColorEye 7000 photometer. Additional 
methods are provided in the Examples herein. Thus, these methods also find use in the 
present invention. 

The addition of proteases of the invention to conventional cleaning compositions 
does not create any special use limitation. In other words, any temperature and pH suitable 
for the detergent is also suitable for the present compositions, as long as the pH is within 
the range set forth herein, and the temperature is below the described protease's denaturing 
temperature. In addition, proteases of the present invention find use in cleaning 
compositions that do not include detergents, again either alone or in combination with 
builders and stabilizers. 

When used in cleaning compositions or detergents, oxidative stability is a further 
consideration. Thus, in some applications, the stability is enhanced, diminished, or 
comparable to subtilisin proteases as desired for various uses. In some preferred 
embodiments, enhanced oxidative stability is desired. Some of the proteases of the 
present invention find particular use in such applications. 

When used in cleaning compositions or detergents, thermal stability is a further 
consideration. Thus, in some applications, the stability is enhanced, diminished, or 
comparable to subtilisin proteases as desired for various uses. In some preferred 
embodiments, enhanced thermostability is desired. Some of the proteases of the present 
invention find particular use in such applications. 

When used in cleaning compositions or detergents, chelator stability is a further 
consideration. Thus, in some applications, the stability is enhanced, diminished, or 
comparable to subtilisin proteases as desired for various uses. In some preferred 
embodiments, enhanced chelator stability is desired. Some of the proteases of the present 
invention find particular use in such applications. 

In some embodiments of the present invention, naturally occurring proteases are 
provided which exhibit modified enzymatic activity at different pHs when compared to 
subtilisin proteases. A pH-activity profile is a plot of pH against enzyme activity and may be 
constructed as described in the Examples and/or by methods known in the art. In some 
embodiments, it is desired to obtain naturally occurring proteases with broader profiles (i.e., 
those having greater activity over a range of pHs than a comparable subtilisin protease). In 
other embodiments, the enzymes have no significantly greater activity at any pH, or are naturally 
occurring homologues with sharper profiles (i.e., those having enhanced activity when 
compared to subtilisin proteases at a given pH, and lesser activity elsewhere). Thus, in 
various embodiments, the proteases of the present invention have differing pH optima 
and/or ranges. It is not intended that the present invention be limited to any specific pH or 
pH range. 

In some embodiments of the present invention, the cleaning compositions comprise 
proteases of the present invention at a level from 0.00001% to 10% of 69B4 and/or other 
protease of the present invention by weight of the composition and the balance (e.g., 
99.999% to 90.0%) comprising cleaning adjunct materials by weight of composition. In 
other aspects of the present invention, the cleaning compositions of the present invention 
comprise the 69B4 and/or other proteases at a level of 0.0001% to 10%, 0.001% to 5%, 
0.001% to 2%, or 0.005% to 0.5% 69B4 or other protease of the present invention by weight of 
the composition and the balance of the cleaning composition (e.g., 99.9999% to 90.0%, 
99.999% to 98%, 99.995% to 99.5% by weight) comprising cleaning adjunct materials. 
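
The relationship between the protease level and the balance of cleaning adjunct materials is a simple complement to 100% by weight. The sketch below illustrates this for some of the range endpoints given above; it assumes that the protease and the cleaning adjunct materials together make up the whole composition, and the function name `adjunct_balance_percent` is hypothetical.

```python
def adjunct_balance_percent(protease_percent: float) -> float:
    """Weight percent of cleaning adjunct materials for a given protease level.

    Assumption: protease plus adjunct materials account for 100% by weight.
    """
    if not 0.0 <= protease_percent <= 100.0:
        raise ValueError("protease level must be between 0 and 100 weight percent")
    return 100.0 - protease_percent


# Endpoints of some of the ranges quoted in the text (weight percent protease).
for level in (0.0001, 0.005, 0.5, 10.0):
    print(f"{level}% protease -> {adjunct_balance_percent(level):.4f}% adjunct materials")
```

For instance, 0.005% protease leaves 99.995% adjunct materials, and 0.5% protease leaves 99.5%.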

In some embodiments, preferred cleaning compositions, in addition to the protease 
preparation of the invention, comprise one or more additional enzymes or enzyme 
derivatives which provide cleaning performance and/or fabric care benefits. Such enzymes 
include, but are not limited to other proteases, lipases, cutinases, amylases, cellulases, 
peroxidases, oxidases (e.g., laccases), and/or mannanases. 

Any other protease suitable for use in alkaline solutions finds use in the compositions 
of the present invention. Suitable proteases include those of animal, vegetable or microbial 
origin. In particularly preferred embodiments, microbial proteases are used. In some 
embodiments, chemically or genetically modified mutants are included. In some 
embodiments, the protease is a serine protease, preferably an alkaline microbial protease or 
a trypsin-like protease. Examples of alkaline proteases include subtilisins, especially those 
derived from Bacillus (e.g., subtilisin, lentus, amyloliquefaciens, subtilisin Carlsberg, 
subtilisin 309, subtilisin 147 and subtilisin 168). Additional examples include those mutant 
proteases described in U.S. Pat. Nos. RE 34,606, 5,955,340, 5,700,676, 6,312,936, and 
6,482,628, all of which are incorporated herein by reference. Additional protease examples 
include, but are not limited to trypsin (e.g., of porcine or bovine origin), and the Fusarium 
protease described in WO 89/06270. Preferred commercially available protease enzymes 
include those sold under the trade names MAXATASE®, MAXACAL™, MAXAPEM™, 
OPTICLEAN®, OPTIMASE®, PROPERASE®, PURAFECT® and PURAFECT® OXP 
(Genencor), those sold under the trade names ALCALASE®, SAVINASE®, PRIMASE®, 
DURAZYM™, RELASE® and ESPERASE® (Novozymes); and those sold under the trade 
name BLAP™ (Henkel Kommanditgesellschaft auf Aktien, Duesseldorf, Germany). Various 
proteases are described in WO 95/23221, WO 92/21760, and U.S. Pat. Nos. 5,801,039, 
5,340,735, 5,500,364, 5,855,625. An additional BPN' variant ("BPN'-var 1" and 
"BPN'-variant 1"; as referred to herein) is described in US RE 34,606. An additional GG36-variant 
("GG36-var 1" and "GG36-variant 1"; as referred to herein) is described in US 5,955,340 
and 5,700,676. A further GG36-variant is described in US Patents 6,312,936 and 
6,482,628. In one aspect of the present invention, the cleaning compositions of the present 
invention comprise additional protease enzymes at a level from 0.00001% to 10% of 
additional protease by weight of the composition and 99.999% to 90.0% of cleaning adjunct 
materials by weight of composition. In other embodiments of the present invention, the 
cleaning compositions of the present invention also comprise proteases at a level of 0.0001% 
to 10%, 0.001% to 5%, 0.001% to 2%, or 0.005% to 0.5% 69B4 protease (or its homologues 
or variants) by weight of the composition and the balance of the cleaning composition (e.g., 
99.9999% to 90.0%, 99.999% to 98%, 99.995% to 99.5% by weight) comprising cleaning 
adjunct materials. 

In addition, any lipase suitable for use in alkaline solutions finds use in the present 
invention. Suitable lipases include, but are not limited to those of bacterial or fungal origin. 
Chemically or genetically modified mutants are encompassed by the present invention. 
Examples of useful lipases include Humicola lanuginosa lipase (See e.g., EP 258 068, and 
EP 305 216), Rhizomucor miehei lipase (See e.g., EP 238 023), Candida lipase, such as C. 
antarctica lipase (e.g., the C. antarctica lipase A or B; See e.g., EP 214 761), a 
Pseudomonas lipase such as P. alcaligenes and P. pseudoalcaligenes lipase (See e.g., EP 
218 272), P. cepacia lipase (See e.g., EP 331 376), P. stutzeri lipase (See e.g., GB 
1,372,034), P. fluorescens lipase, Bacillus lipase (e.g., B. subtilis lipase [Dartois et al., 
Biochem. Biophys. Acta 1131:253-260 [1993]]; B. stearothermophilus lipase [See e.g., JP 
64/744992]; and B. pumilus lipase [See e.g., WO 91/16422]). 

Furthermore, a number of cloned lipases find use in some embodiments of the 
present invention, including but not limited to Penicillium camembertii lipase (See, 
Yamaguchi et al., Gene 103:61-67 [1991]), Geotricum candidum lipase (See, Schimada et 
al., J. Biochem., 106:383-388 [1989]), and various Rhizopus lipases such as R. delemar 
lipase (See, Hass et al., Gene 109:117-113 [1991]), a R. niveus lipase (Kugimiya et al., 
Biosci. Biotech. Biochem. 56:716-719 [1992]) and R. oryzae lipase. 

Other types of lipolytic enzymes such as cutinases also find use in some 
embodiments of the present invention, including but not limited to the cutinase derived from 
Pseudomonas mendocina (See, WO 88/09367), or cutinase derived from Fusarium solani 
pisi (See, WO 90/09446). 

Additional suitable lipases include commercially available lipases such as M1 
LIPASE™, LUMA FAST™, and LIPOMAX™ (Genencor); LIPOLASE® and LIPOLASE® 
ULTRA (Novozymes); and LIPASE P™ "Amano" (Amano Pharmaceutical Co. Ltd., Japan). 

In some embodiments of the present invention, the cleaning compositions of the 
present invention further comprise lipases at a level from 0.00001% to 10% of additional 
lipase by weight of the composition and the balance of cleaning adjunct materials by weight 
of composition. In other aspects of the present invention, the cleaning compositions of the 
present invention also comprise lipases at a level of 0.0001% to 10%, 0.001% to 5%, 
0.001% to 2%, or 0.005% to 0.5% lipase by weight of the composition. 

Any amylase (alpha and/or beta) suitable for use in alkaline solutions also finds use in 
some embodiments of the present invention. Suitable amylases include, but are not limited 
to those of bacterial or fungal origin. Chemically or genetically modified mutants are 
included in some embodiments. Amylases that find use in the present invention include, 
but are not limited to α-amylases obtained from B. licheniformis (See e.g., GB 1,296,839). 
Commercially available amylases that find use in the present invention include, but are not 
limited to DURAMYL®, TERMAMYL®, FUNGAMYL® and BAN™ (Novozymes) and 
RAPIDASE® and MAXAMYL® P (Genencor International). 

In some embodiments of the present invention, the cleaning compositions of the 
present invention further comprise amylases at a level from 0.00001% to 10% of additional 
amylase by weight of the composition and the balance of cleaning adjunct materials by 
weight of composition. In other aspects of the present invention, the cleaning compositions 
of the present invention also comprise amylases at a level of 0.0001% to 10%, 0.001% to 
5%, 0.001% to 2%, or 0.005% to 0.5% amylase by weight of the composition. 

Any cellulase suitable for use in alkaline solutions finds use in embodiments of the 
present invention. Suitable cellulases include, but are not limited to those of bacterial or 
fungal origin. Chemically or genetically modified mutants are included in some 
embodiments. Suitable cellulases include, but are not limited to Humicola insolens 
cellulases (See e.g., U.S. Pat. No. 4,435,307). Especially suitable cellulases are the 
cellulases having color care benefits (See e.g., EP 0 495 257). 

Commercially available cellulases that find use in the present invention include, but are not 
limited to CELLUZYME® (Novozymes), and KAC-500(B)™ (Kao Corporation). In some 
embodiments, cellulases are incorporated as portions or fragments of mature wild-type or 
variant cellulases, wherein a portion of the N-terminus is deleted (See e.g., U.S. Pat. No. 
5,874,276). 

In some embodiments, the cleaning compositions of the present invention can 
further comprise cellulases at a level from 0.00001% to 10% of additional cellulase by 
weight of the composition and the balance of cleaning adjunct materials by weight of 
composition. In other aspects of the present invention, the cleaning compositions of the 
present invention also comprise cellulases at a level of 0.0001% to 10%, 0.001% to 5%, 
0.001% to 2%, or 0.005% to 0.5% cellulase by weight of the composition. 

Any mannanase suitable for use in detergent compositions and/or alkaline solutions 
finds use in the present invention. Suitable mannanases include, but are not limited to those 
of bacterial or fungal origin. Chemically or genetically modified mutants are included in some 
embodiments. Various mannanases are known which find use in the present invention (See 
e.g., U.S. Pat. No. 6,566,114, U.S. Pat. No. 6,602,842, and U.S. Pat. No. 6,440,991, all of 
which are incorporated herein by reference). 

In some embodiments, the cleaning compositions of the present invention can 
further comprise mannanases at a level from 0.00001% to 10% of additional mannanase by 
weight of the composition and the balance of cleaning adjunct materials by weight of 
composition. In other aspects of the present invention, the cleaning compositions of the 
present invention also comprise mannanases at a level of 0.0001% to 10%, 0.001% to 5%, 
0.001% to 2%, or 0.005% to 0.5% mannanase by weight of the composition. 

In some embodiments, peroxidases are used in combination with hydrogen peroxide 
or a source thereof (e.g., a percarbonate, perborate or persulfate). In alternative 
embodiments, oxidases are used in combination with oxygen. Both types of enzymes are 
used for "solution bleaching" (i.e., to prevent transfer of a textile dye from a dyed fabric to 
another fabric when the fabrics are washed together in a wash liquor), preferably together 
with an enhancing agent (See e.g., WO 94/12621 and WO 95/01426). Suitable 
peroxidases/oxidases include, but are not limited to those of plant, bacterial or fungal origin. 
Chemically or genetically modified mutants are included in some embodiments. 

In some embodiments, the cleaning compositions of the present invention can 
further comprise peroxidase and/or oxidase enzymes at a level from 0.00001% to 10% of 
additional peroxidase and/or oxidase by weight of the composition and the balance of 
cleaning adjunct materials by weight of composition. In other aspects of the present 
invention, the cleaning compositions of the present invention also comprise peroxidase 
and/or oxidase enzymes at a level of 0.0001% to 10%, 0.001% to 5%, 0.001% to 2%, or 
0.005% to 0.5% peroxidase and/or oxidase enzymes by weight of the composition. 

Mixtures of the above mentioned enzymes are encompassed herein, in particular a 
mixture of the 69B4 enzyme, one or more additional proteases, at least one amylase, at 
least one lipase, at least one mannanase, and/or at least one cellulase. Indeed, it is 
contemplated that various mixtures of these enzymes will find use in the present invention. 

It is contemplated that the levels of the protease and of one or more additional 
enzymes may both independently range up to 10%, the balance of the cleaning composition 
being cleaning adjunct materials. The specific selection of cleaning adjunct materials is 
readily made by considering the surface, item, or fabric to be cleaned, and the desired form 
of the composition for the cleaning conditions during use (e.g., through the wash detergent 
use). 
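
The balance arithmetic described above can be sketched as follows. This is only an illustration: the function name `adjunct_balance_pct` and the example levels are hypothetical, and the sketch assumes that every level is a weight percentage of the whole composition and that the 10% ceiling applies separately to the protease and to the additional enzymes.

```python
def adjunct_balance_pct(protease_pct, additional_enzyme_pct=0.0, ceiling_pct=10.0):
    """Return the weight percent left over for cleaning adjunct materials.

    Assumption (illustrative only): all levels are weight percentages of the
    total composition, and the protease and the additional enzymes may each
    independently range up to ``ceiling_pct``.
    """
    for name, level in (("protease", protease_pct),
                        ("additional enzymes", additional_enzyme_pct)):
        if not 0.0 <= level <= ceiling_pct:
            raise ValueError(f"{name} level {level}% is outside 0-{ceiling_pct}%")
    # Whatever is not enzyme is treated as cleaning adjunct material.
    return 100.0 - protease_pct - additional_enzyme_pct


# Hypothetical example: 0.5% protease plus 1.5% other enzymes.
balance = adjunct_balance_pct(0.5, 1.5)
print(f"adjunct balance: {balance}% by weight")
```

A level above the ceiling raises `ValueError` rather than being silently clipped, so an out-of-range formulation is flagged instead of being adjusted.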

Examples of suitable cleaning adjunct materials include, but are not limited to, 
surfactants, builders, bleaches, bleach activators, bleach catalysts, other enzymes, enzyme 
stabilizing systems, chelants, optical brighteners, soil release polymers, dye transfer agents, 
dispersants, suds suppressors, dyes, perfumes, colorants, filler salts, hydrotropes, 
photoactivators, fluorescers, fabric conditioners, hydrolyzable surfactants, preservatives, 
anti-oxidants, anti-shrinkage agents, anti-wrinkle agents, germicides, fungicides, color 
speckles, silvercare, anti-tarnish and/or anti-corrosion agents, alkalinity sources, solubilizing 
agents, carriers, processing aids, pigments, and pH control agents (See e.g., U.S. Pat. Nos. 
6,610,642, 6,605,458, 5,705,464, 5,710,115, 5,698,504, 5,695,679, 5,686,014 and 
5,646,101, all of which are incorporated herein by reference). Embodiments of specific 
cleaning composition materials are exemplified in detail below. 

If the cleaning adjunct materials are not compatible with the proteases of the present 
invention in the cleaning compositions, then suitable methods of keeping the cleaning 
adjunct materials and the protease(s) separated (i.e., not in contact with each other) until 
combination of the two components is appropriate are used. Such separation methods 
include any suitable method known in the art (e.g., gelcaps, encapsulation, tablets, physical 
separation, etc.). 

Preferably an effective amount of one or more protease(s) provided herein is 
included in compositions useful for cleaning a variety of surfaces in need of proteinaceous 
stain removal. Such cleaning compositions include cleaning compositions for such 
applications as cleaning hard surfaces, fabrics, and dishes. Indeed, in some embodiments, 
the present invention provides fabric cleaning compositions, while in other embodiments, the 
present invention provides non-fabric cleaning compositions. Notably, the present invention 
also provides cleaning compositions suitable for personal care, including oral care (including 
dentifrices, toothpastes, mouthwashes, etc., as well as denture cleaning compositions), skin, 
and hair cleaning compositions. It is intended that the present invention encompass 
detergent compositions in any form (e.g., liquid, granular, bar, semi-solid, gels, emulsions, 
tablets, capsules, etc.). 

By way of example, several cleaning compositions wherein the protease of the 
present invention finds use are described in greater detail below. In embodiments in which 
the cleaning compositions of the present invention are formulated as compositions suitable 
for use in laundry machine washing method(s), the compositions of the present invention 
preferably contain at least one surfactant and at least one builder compound, as well as one 
or more cleaning adjunct materials preferably selected from organic polymeric compounds, 
bleaching agents, additional enzymes, suds suppressors, dispersants, lime-soap 
dispersants, soil suspension and anti-redeposition agents and corrosion inhibitors. In some 
embodiments, laundry compositions also contain softening agents (i.e., as additional 
cleaning adjunct materials). 

The compositions of the present invention also find use as detergent additive products 
in solid or liquid form. Such additive products are intended to supplement and/or boost the 
performance of conventional detergent compositions and can be added at any stage of the 
cleaning process. 

In embodiments formulated as compositions for use in manual dishwashing 
methods, the compositions of the invention preferably contain at least one surfactant and 
preferably at least one additional cleaning adjunct material selected from organic polymeric 
compounds, suds enhancing agents, group II metal ions, solvents, hydrotropes and 
additional enzymes. 

In some embodiments, the density of the laundry detergent compositions herein 
ranges from 400 to 1200 g/liter, while in other embodiments, it ranges from 500 to 950 g/liter 
of composition measured at 20°C. 

In some embodiments, various cleaning compositions such as those provided in U.S. 
Pat. No. 6,605,458 find use with the proteases of the present invention. Thus, in some 
embodiments, the composition comprising at least one protease of the present invention is 
a compact granular fabric cleaning composition, while in other embodiments, the 
composition is a granular fabric cleaning composition useful in the laundering of colored 
fabrics; in further embodiments, the composition is a granular fabric cleaning composition 
which provides softening through the wash capacity; and in additional embodiments, the 
composition is a heavy duty liquid fabric cleaning composition. 

In some embodiments, the compositions comprising at least one protease of the 
present invention are fabric cleaning compositions such as those described in U.S. Pat. 
Nos. 6,610,642 and 6,376,450. In addition, the proteases of the present invention find use 
in granular laundry detergent compositions of particular utility under European or Japanese 
washing conditions (See e.g., U.S. Pat. No. 6,610,642). 

In alternative embodiments, the present invention provides hard surface cleaning 
compositions comprising at least one protease provided herein. Thus, in some 
embodiments, the composition comprising at least one protease of the present invention is 
a hard surface cleaning composition such as those described in U.S. Pat. Nos. 6,610,642, 
6,376,450, and 6,376,450. 

In yet further embodiments, the present invention provides dishwashing 
compositions comprising at least one protease provided herein. Thus, in some 
embodiments, the composition comprising at least one protease of the present invention is 
a hard surface cleaning composition such as those in U.S. Pat. Nos. 6,610,642 and 
6,376,450. 

In still further embodiments, the present invention provides oral care 
compositions comprising at least one protease provided herein. Thus, in some 
embodiments, the compositions comprising at least one protease of the present invention 
comprise oral care compositions such as those in U.S. Pat. Nos. 6,376,450, and 6,376,450. 

The formulations and descriptions of the compounds and cleaning adjunct materials 
contained in the aforementioned U.S. Pat. Nos. 6,376,450, 6,605,458, 6,605,458, and 
6,610,642 are expressly incorporated by reference herein. Still further 
examples are set forth in the Examples below. 

I) Processes of Making and Using the Cleaning Composition of the 
Present Invention 

The cleaning compositions of the present invention can be formulated into any 
suitable form and prepared by any process chosen by the formulator, non-limiting examples 
of which are described in U.S. Pat. Nos. 5,879,584, 5,691,297, 5,574,005, 5,569,645, 
5,565,422, 5,516,448, 5,489,392, and 5,486,303, all of which are incorporated herein by 
reference. When a low pH cleaning composition is desired, the pH of such composition may 
be adjusted via the addition of a material such as monoethanolamine or an acidic material 
such as HCl. 

II) Adjunct Materials In Addition to the Serine Proteases of the Present 
Invention 

While not essential for the purposes of the present invention, the adjuncts in the 
non-limiting list illustrated hereinafter are suitable for use in the instant cleaning compositions and 
may be desirably incorporated in certain embodiments of the invention, for example to assist 
or enhance cleaning performance, for treatment of the substrate to be cleaned, or to modify 
the aesthetics of the cleaning composition as is the case with perfumes, colorants, dyes or 
the like. It is understood that such adjuncts are in addition to the serine proteases of the 
present invention. The precise nature of these additional components, and levels of 
incorporation thereof, will depend on the physical form of the composition and the nature of 
the cleaning operation for which it is to be used. Suitable adjunct materials include, but are 
not limited to, surfactants, builders, chelating agents, dye transfer inhibiting agents, 
deposition aids, dispersants, additional enzymes, and enzyme stabilizers, catalytic materials, 
bleach activators, bleach boosters, hydrogen peroxide, sources of hydrogen peroxide, 
preformed peracids, polymeric dispersing agents, clay soil removal/anti-redeposition agents, 
brighteners, suds suppressors, dyes, perfumes, structure elasticizing agents, fabric 
softeners, carriers, hydrotropes, processing aids and/or pigments. In addition to the 
disclosure below, suitable examples of such other adjuncts and levels of use are found in 
U.S. Patent Nos. 5,576,282, 6,306,812, and 6,326,348, which are incorporated by reference. 

The aforementioned adjunct ingredients may constitute the balance of the cleaning 
compositions of the present invention. 

Surfactants - The cleaning compositions according to the present invention may 
comprise a surfactant or surfactant system wherein the surfactant can be selected from 
nonionic surfactants, anionic surfactants, cationic surfactants, ampholytic surfactants, 
zwitterionic surfactants, semi-polar nonionic surfactants and mixtures thereof. When a low 
pH cleaning composition, such as a composition having a neat pH of from about 3 to about 5, 
is desired, such composition typically does not contain alkyl ethoxylated sulfate, as it is 
believed that such surfactant may be hydrolyzed by the acidic contents of such compositions. 

The surfactant is typically present at a level of from about 0.1% to about 60%, from 
about 1% to about 50% or even from about 5% to about 40% by weight of the subject 
cleaning composition. 

Builders - The cleaning compositions of the present invention may comprise one or 
more detergent builders or builder systems. When a builder is used, the subject cleaning 
composition will typically comprise at least about 1%, from about 3% to about 60% or even 
from about 5% to about 40% builder by weight of the subject cleaning composition. 

Builders include, but are not limited to, the alkali metal, ammonium and 
alkanolammonium salts of polyphosphates, alkali metal silicates, alkaline earth and alkali 
metal carbonates, aluminosilicate builders, polycarboxylate compounds, ether 
hydroxypolycarboxylates, copolymers of maleic anhydride with ethylene or vinyl methyl 
ether, 1,3,5-trihydroxybenzene-2,4,6-trisulphonic acid, and carboxymethyloxysuccinic 
acid, the various alkali metal, ammonium and substituted ammonium salts of polyacetic 
acids such as ethylenediamine tetraacetic acid and nitrilotriacetic acid, as well as 
polycarboxylates such as mellitic acid, succinic acid, citric acid, oxydisuccinic acid, 
polymaleic acid, benzene 1,3,5-tricarboxylic acid, carboxymethyloxysuccinic acid, and 
soluble salts thereof. 

Chelating Agents - The cleaning compositions herein may contain a chelating agent. 
Suitable chelating agents include copper, iron and/or manganese chelating agents and 
mixtures thereof. 

When a chelating agent is used, the cleaning composition may comprise from about 
0.1% to about 15% or even from about 3.0% to about 10% chelating agent by weight of the 
subject cleaning composition. 

Deposition Aid - The cleaning compositions herein may contain a deposition aid. 
Suitable deposition aids include polyethylene glycol, polypropylene glycol, polycarboxylate, 
soil release polymers such as polyterephthalic acid, clays such as kaolinite, montmorillonite, 
attapulgite, illite, bentonite, halloysite, and mixtures thereof. 

Dye Transfer Inhibiting Agents - The cleaning compositions of the present invention 
may also include one or more dye transfer inhibiting agents. Suitable polymeric dye transfer 
inhibiting agents include, but are not limited to, polyvinylpyrrolidone polymers, polyamine 
N-oxide polymers, copolymers of N-vinylpyrrolidone and N-vinylimidazole, 
polyvinyloxazolidones and polyvinylimidazoles or mixtures thereof. 

When present in a subject cleaning composition, the dye transfer inhibiting agents 
may be present at levels from about 0.0001% to about 10%, from about 0.01% to about 5% 
or even from about 0.1% to about 3% by weight of the cleaning composition. 

Dispersants - The cleaning compositions of the present invention can also contain 
dispersants. Suitable water-soluble organic materials include the homo- or co-polymeric 
acids or their salts, in which the polycarboxylic acid comprises at least two carboxyl radicals 
separated from each other by not more than two carbon atoms. 

Enzymes - The cleaning compositions can comprise one or more detergent enzymes 
which provide cleaning performance and/or fabric care benefits. Examples of suitable 
enzymes include, but are not limited to, hemicellulases, peroxidases, proteases, cellulases, 
xylanases, lipases, phospholipases, esterases, cutinases, pectinases, keratinases, 
reductases, oxidases, phenol oxidases, lipoxygenases, ligninases, pullulanases, tannases, 
pentosanases, malanases, β-glucanases, arabinosidases, hyaluronidase, chondroitinase, 
laccase, and amylases, or mixtures thereof. A typical combination is a cocktail of 
conventional applicable enzymes like protease, lipase, cutinase and/or cellulase in 
conjunction with amylase. 

Enzyme Stabilizers - Enzymes for use in detergents can be stabilized by various 
techniques. The enzymes employed herein can be stabilized by the presence of water- 
soluble sources of calcium and/or magnesium ions in the finished compositions that provide 
such ions to the enzymes. 

Catalytic Metal Complexes - The cleaning compositions of the present invention may 
include catalytic metal complexes. One type of metal-containing bleach catalyst is a catalyst 
system comprising a transition metal cation of defined bleach catalytic activity, such as 
copper, iron, titanium, ruthenium, tungsten, molybdenum, or manganese cations, an 
auxiliary metal cation having little or no bleach catalytic activity, such as zinc or aluminum 
cations, and a sequestrate having defined stability constants for the catalytic and auxiliary 
metal cations, particularly ethylenediaminetetraacetic acid, 
ethylenediaminetetra(methylenephosphonic acid) and water-soluble salts thereof. Such catalysts are disclosed in 
U.S. Pat. No. 4,430,243. 

If desired, the compositions herein can be catalyzed by means of a manganese 
compound. Such compounds and levels of use are well known in the art and include, for 
example, the manganese-based catalysts disclosed in U.S. Pat. No. 5,576,282. 

Cobalt bleach catalysts useful herein are known, and are described, for example, in 
U.S. Pat. Nos. 5,597,936, and 5,595,967. Such cobalt catalysts are readily prepared by 
known procedures, such as taught for example in U.S. Pat. Nos. 5,597,936, and 5,595,967. 

Compositions herein may also suitably include a transition metal complex of a 
macropolycyclic rigid ligand - abbreviated as "MRL". As a practical matter, and not by way 
of limitation, the compositions and cleaning processes herein can be adjusted to provide on 
the order of at least one part per hundred million of the active MRL species in the aqueous 
washing medium, and will preferably provide from about 0.005 ppm to about 25 ppm, more 
preferably from about 0.05 ppm to about 10 ppm, and most preferably from about 0.1 ppm 
to about 5 ppm, of the MRL in the wash liquor. 

Preferred transition-metals in the instant transition-metal bleach catalyst include 
manganese, iron and chromium. Preferred MRLs herein are a special type of ultra-rigid 
ligand that is cross-bridged such as 5,12-diethyl-1,5,8,12-tetraazabicyclo[6.6.2]hexadecane. 

Suitable transition metal MRLs are readily prepared by known procedures, such as 
taught for example in WO 00/332601, and U.S. Pat. No. 6,225,464. 

III) Processes of Making and Using Cleaning Compositions 

The cleaning compositions of the present invention can be formulated into any 
suitable form and prepared by any process chosen by the formulator, non-limiting examples 
of which are described in U.S. Pat. Nos. 5,879,584, 5,691,297, 5,574,005, 5,569,645, 
5,516,448, 5,489,392, and 5,486,303, all of which are incorporated herein by reference. 

IV) Method of Use 

The cleaning compositions disclosed herein can be used to clean a situs, inter alia 
a surface or fabric. Typically at least a portion of the situs is contacted with an embodiment 
of the present cleaning composition, in neat form or diluted in a wash liquor, and then the 
situs is optionally washed and/or rinsed. For purposes of the present invention, washing 
includes, but is not limited to, scrubbing and mechanical agitation. The fabric may comprise 
most any fabric capable of being laundered in normal consumer use conditions. The 
disclosed cleaning compositions are typically employed at concentrations of from about 500 
ppm to about 15,000 ppm in solution. When the wash solvent is water, the water 
temperature typically ranges from about 5°C to about 90°C and, when the situs comprises a 
fabric, the water to fabric mass ratio is typically from about 1:1 to about 30:1. 
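
The typical use conditions above lend themselves to a short worked calculation. The following is a minimal sketch, not a prescribed dosing method: the function names and the example load are hypothetical, 1 ppm is treated as 1 mg of composition per liter of wash liquor, and 1 kg of water is treated as 1 L. The source ranges are stated as "about", so the hard limits here are a simplification.

```python
def water_for_fabric_kg(fabric_kg, water_to_fabric_ratio):
    """Mass of water (kg) for a fabric load at a given water:fabric mass ratio."""
    if not 1 <= water_to_fabric_ratio <= 30:
        raise ValueError("ratio outside the typical 1:1 to 30:1 range")
    return fabric_kg * water_to_fabric_ratio


def wash_dose_grams(target_ppm, wash_volume_liters):
    """Grams of composition for a target concentration (assumes 1 ppm = 1 mg/L)."""
    if not 500 <= target_ppm <= 15_000:
        raise ValueError("concentration outside the typical 500-15,000 ppm range")
    return target_ppm * wash_volume_liters / 1000.0  # convert mg to g


# Hypothetical load: 3 kg of fabric at a 10:1 ratio, dosed at 5,000 ppm.
water_kg = water_for_fabric_kg(3.0, 10)    # 30 kg, taken as about 30 L
dose_g = wash_dose_grams(5_000, water_kg)  # 150 g of composition
print(f"{water_kg} kg water, {dose_g} g composition")
```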

B. Animal Feed 

Still further, the present invention provides compositions and methods for the 
production of a food or animal feed, characterized in that a protease according to the 
invention is mixed with food or animal feed. In some embodiments, the protease is added 
as a dry product before processing, while in other embodiments it is added as a liquid before 
or after processing. In some embodiments, in which a dry powder is used, the enzyme is 
diluted as a liquid onto a dry carrier such as milled grain. The proteases of the present 
invention find use as components of animal feeds and/or additives such as those described 
in U.S. Pat. No. 5,612,055, U.S. Pat. No. 5,314,692, and U.S. Pat. No. 5,147,642, all of which 
are hereby incorporated by reference. 

The enzyme feed additive according to the present invention is suitable for 
preparation in a number of methods. For example, in some embodiments, it is prepared 
simply by mixing different enzymes having the appropriate activities to produce an enzyme 
mix. In some embodiments, this enzyme mix is mixed directly with a feed, while in other 
embodiments, it is impregnated onto a cereal-based carrier material such as milled wheat, 
maize or soya flour. The present invention also encompasses these impregnated carriers, 
as they find use as enzyme feed additives. 

In some alternative embodiments, a cereal-based carrier (e.g., milled wheat or 
maize) is impregnated either simultaneously or sequentially with enzymes having the 
appropriate activities. For example, in some embodiments, a milled wheat carrier is first 
sprayed with a xylanase, secondly with a protease, and optionally with a β-glucanase. The 
present invention also encompasses these impregnated carriers, as they find use as 
enzyme feed additives. In preferred embodiments, these impregnated carriers comprise at 
least one protease of the present invention. 

In some embodiments, the feed additive of the present invention is directly mixed 
with the animal feed, while in alternative embodiments, it is mixed with one or more other 
feed additives such as a vitamin feed additive, a mineral feed additive, and/or an amino acid 
feed additive. The resulting feed additive including several different types of components is 
then mixed in an appropriate amount with the feed. 

In some preferred embodiments, the feed additive of the present invention, including 
cereal-based carriers, is normally mixed in amounts of 0.01-50 g per kilogram of feed, more 
preferably 0.1-10 g/kilogram, and most preferably about 1 g/kilogram. 
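
As a worked example of the inclusion rates above, the sketch below scales a per-kilogram dose to a feed batch. The function name `additive_mass_g` and the 500 kg batch are hypothetical and used only for illustration; the only figures taken from the text are the 0.01-50 g/kg range and the roughly 1 g/kg most-preferred level.

```python
# Broadest inclusion range stated in the text, in g of additive per kg of feed.
BROAD_RANGE_G_PER_KG = (0.01, 50.0)


def additive_mass_g(feed_kg, dose_g_per_kg=1.0):
    """Grams of feed additive (including carrier) for a batch of feed.

    The default dose of 1 g/kg mirrors the most-preferred level in the text.
    """
    low, high = BROAD_RANGE_G_PER_KG
    if not low <= dose_g_per_kg <= high:
        raise ValueError(f"dose {dose_g_per_kg} g/kg is outside {low}-{high} g/kg")
    return feed_kg * dose_g_per_kg


# Hypothetical batch: 500 kg of feed at the default 1 g/kg.
print(additive_mass_g(500.0))
```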

In alternative embodiments, the enzyme feed additive of the present invention 
involves construction of recombinant microorganisms that produce the desired enzyme(s) 
in the desired relative amounts. In some embodiments, this is accomplished by increasing 
the copy number of the gene encoding at least one protease of the present invention, and/or 
by using a suitably strong promoter operatively linked to the polynucleotide encoding the 
protease(s). In further embodiments, the recombinant microorganism strain has certain 
enzyme activities deleted (e.g., cellulases, endoglucanases, etc.), as desired. 

In additional embodiments, the enzyme feed additives provided by the present 
invention also include other enzymes, including but not limited to at least one xylanase, 
α-amylase, glucoamylase, pectinase, mannanase, α-galactosidase, phytase, and/or lipase. In 
some embodiments, the enzymes having the desired activities are mixed with the xylanase 
and protease either before impregnating these on a cereal-based carrier or alternatively 
such enzymes are impregnated simultaneously or sequentially on such a cereal-based 
carrier. The carrier is then in turn mixed with a cereal-based feed to prepare the final feed. 
In alternative embodiments, the enzyme feed additive is formulated as a solution of the 
individual enzyme activities and then mixed with a feed material pre-formed as pellets or as 
a mash. 

In still further embodiments, the enzyme feed additive is included in animals' diets by 
incorporating it into a second (i.e., different) feed or the animals' drinking water. 
Accordingly, it is not essential that the enzyme mix provided by the present invention be 
incorporated into the cereal-based feed itself, although such incorporation forms a 
particularly preferred embodiment of the present invention. The ratio of the units of 
xylanase activity per g of the feed additive to the units of protease activity per g of the feed 
additive is preferably 1:0.001-1,000, more preferably 1:0.01-100, and most preferably 1:0.1- 
10. As indicated above, the enzyme mix provided by the present invention preferably 
finds use as a feed additive in the preparation of a cereal-based feed. 
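
The nested xylanase-to-protease ratio ranges above can be checked with a few lines of code. This is a hedged sketch: the function name `ratio_tier` and the example activities are hypothetical, and the ratio 1:x is read as x units of protease activity per unit of xylanase activity, both expressed per gram of feed additive.

```python
def ratio_tier(xylanase_units_per_g, protease_units_per_g):
    """Classify a xylanase:protease activity ratio against the stated ranges.

    The ratio 1:x is interpreted as x = protease units / xylanase units.
    """
    if xylanase_units_per_g <= 0:
        raise ValueError("xylanase activity must be positive")
    x = protease_units_per_g / xylanase_units_per_g
    # Check the narrowest range first, since the ranges are nested.
    if 0.1 <= x <= 10:
        return "most preferred (1:0.1-10)"
    if 0.01 <= x <= 100:
        return "more preferred (1:0.01-100)"
    if 0.001 <= x <= 1000:
        return "preferred (1:0.001-1,000)"
    return "outside the stated ranges"


# Hypothetical additive: 1,000 xylanase units/g and 500 protease units/g.
print(ratio_tier(1000, 500))
```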

In some embodiments, the cereal-based feed comprises at least 25% by weight, or 
more preferably at least 35% by weight, wheat or maize or a combination of both of these 
cereals. The feed further comprises a protease (i.e., at least one protease of the present 
invention) in such an amount that the feed includes 100-100,000 units of protease activity 
per kg. 

Cereal-based feeds according to the present 
invention find use as feed for a variety of non-human animals, including poultry (e.g., 
turkeys, geese, ducks, chickens, etc.), livestock (e.g., pigs, sheep, cattle, goats, etc.), and 
companion animals (e.g., horses, dogs, cats, rabbits, mice, etc.). The feeds are particularly 
suitable for poultry and pigs, and in particular broiler chickens. 

C. Textile and Leather Treatment 

The present invention also provides compositions for the treatment of textiles that 
include at least one of the proteases of the present invention. In some embodiments, at 
least one protease of the present invention is a component of compositions suitable for the 
treatment of silk or wool (See e.g., U.S. RE Pat. No. 216,034, EP 134,267, U.S. Pat. No. 
4,533,359, and EP 344,259). 

In addition, the proteases of the present invention find use in a variety of applications 
where it is desirable to separate phosphorous from phytate. Accordingly, the present 
invention also provides methods for producing wool or animal hair material with improved 
properties. In some preferred embodiments, these methods comprise the steps of 
pretreating wool, wool fibres or animal hair material in a process selected from the group 
consisting of plasma treatment processes and the Delhey process; and subjecting the 
pretreated wool or animal hair material to a treatment with a proteolytic enzyme (e.g., at 
least one protease of the present invention) in an amount effective for improving the 
properties. In some embodiments, the proteolytic enzyme treatment occurs prior to the 
plasma treatment, while in other embodiments, it occurs after the plasma treatment. In 
some further embodiments, it is conducted as a separate step, while in other embodiments, 
it is conducted in combination with the scouring or the dyeing of the wool or animal hair 
material. In additional embodiments, at least one surfactant and/or at least one softener is 
present during the enzyme treatment step, while in other embodiments, the surfactant(s) 
and/or softener(s) are incorporated in a separate step wherein the wool or animal hair 
material is subjected to a softening treatment. 

In some embodiments, the compositions of the present invention find use in methods 
for shrink-proofing wool fibers (See e.g., JP 4-327274). In some embodiments, the 
compositions are used in methods for shrink-proofing treatment of wool fibers by subjecting 
the fibers to a low-temperature plasma treatment, followed by treatment with a shrink- 
proofing resin such as a block-urethane resin, polyamide epichlorohydrin resin, glyoxalic 
resin, ethylene-urea resin or acrylate resin, and then treatment with a weight reducing 
proteolytic enzyme for obtaining a softening effect. In some embodiments, the plasma 
treatment step is a low-temperature treatment, preferably a corona discharge treatment or a 
glow discharge treatment. 

In some embodiments, the low-temperature plasma treatment is carried out by using
a gas, preferably a gas selected from the group consisting of air, oxygen, nitrogen,
ammonia, helium, or argon. Conventionally, air is used but it may be advantageous to use
any of the other indicated gases.

Preferably, the low-temperature plasma treatment is carried out at a pressure
between about 0.1 torr and 5 torr for about 2 seconds to about 300 seconds, preferably
for about 5 seconds to about 100 seconds, more preferably for about 5 seconds to about
30 seconds.

As indicated above, the present invention finds use in conjunction with methods such 
as the Delhey process (See e.g., DE-A-43 32 692). In this process, the wool is treated in an 
aqueous solution of hydrogen peroxide in the presence of soluble wolframate, optionally 
followed by treatment in a solution or dispersion of synthetic polymers, for improving the 
anti-felting properties of the wool. In this method, the wool is treated in an aqueous solution
of hydrogen peroxide (0.1-35% (w/w), preferably 2-10% (w/w)), in the presence of 2-60%
(w/w), preferably 8-20% (w/w), of a catalyst (preferably Na2WO4), and in the presence of a
nonionic wetting agent. Preferably, the treatment is carried out at pH 8-11 and room
temperature. The treatment time depends on the concentrations of hydrogen peroxide and
catalyst, but is preferably 2 minutes or less. After the oxidative treatment, the wool is rinsed 
with water. For removal of residual hydrogen peroxide, and optionally for additional 
bleaching, the wool is further treated in acidic solutions of reducing agents (e.g., sulfites, 
phosphites etc.). 

In some embodiments, the enzyme treatment step is carried out for between about 1
minute and about 120 minutes. This step is preferably carried out at a temperature of 
between about 20°C. and about 60°C., more preferably between about 30°C. and about 
50°C. Alternatively, the wool is soaked in or padded with an aqueous enzyme solution and 
then subjected to steaming at a conventional temperature and pressure, typically for about 
30 seconds to about 3 minutes. In some preferred embodiments, the proteolytic enzyme 
treatment is carried out in an acidic or neutral or alkaline medium which may include a 
buffer. 

In alternative embodiments, the enzyme treatment step is conducted in the presence
of one or more conventional anionic, non-ionic (e.g., Dobanol; Henkel AG) or cationic
surfactants. An example of a useful nonionic surfactant is Dobanol (from Henkel AG). In
further embodiments, the wool or animal hair material is subjected to an ultrasound 
treatment, either prior to or simultaneous with the treatment with a proteolytic enzyme. In 
some preferred embodiments, the ultrasound treatment is carried out at a temperature of 
about 50°C for about 5 minutes. In some preferred embodiments, the amount of proteolytic 
enzyme used in the enzyme treatment step is between about 0.2 w/w % and about 10 w/w 
%, based on the weight of the wool or animal hair material. In some embodiments, in order
to reduce the number of treatment steps, the enzyme treatment is carried out during dyeing and/or
scouring of the wool or animal hair material, simply by adding the protease to the dyeing, 
rinsing and/or scouring bath. In some embodiments, enzyme treatment is carried out after 
the plasma treatment but in other embodiments, the two treatment steps are carried out in 
the opposite order. 

Softeners conventionally used on wool are usually cationic softeners, either organic 
cationic softeners or silicone based products, but anionic or non-ionic softeners are also 
useful. Examples of useful softeners include, but are not limited to, polyethylene softeners
and silicone softeners (i.e., dimethyl polysiloxanes (silicone oils), H-polysiloxanes, silicone
elastomers, aminofunctional dimethyl polysiloxanes, aminofunctional silicone elastomers,
and epoxyfunctional dimethyl polysiloxanes), and organic cationic softeners (e.g., alkyl
quaternary ammonium derivatives).

In additional embodiments, the present invention provides compositions for the
treatment of an animal hide that includes at least one protease of the present invention. In 
some embodiments, the proteases of the present invention find use in compositions for 
treatment of animal hide, such as those described in WO 03/00865 (Insect Biotech Co., 
Taejeon-Si, Korea). In additional embodiments, the present invention provides methods for 
processing hides and/or skins into leather comprising enzymatic treatment of the hide or 
skin with the protease of the present invention (See e.g., WO 96/11285). In additional
embodiments, the present invention provides compositions for the treatment of an animal 
skin or hide into leather that includes at least one protease of the present invention. 

Hides and skins are usually received in the tanneries in the form of salted or dried 
raw hides or skins. The processing of hides or skins into leather comprises several different 
process steps including the steps of soaking, unhairing and bating. These steps constitute 
the wet processing and are performed in the beamhouse. Enzymatic treatment utilizing the
proteases of the present invention is applicable at any time during the process involved in
the processing of leather. However, proteases are usually employed during the wet 
processing (i.e., during soaking, unhairing and/or bating). Thus, in some preferred 
embodiments, the enzymatic treatment with at least one of the proteases of the present 
invention occurs during the wet processing stage. 

In some embodiments, the soaking processes of the present invention are 
performed under conventional soaking conditions (e.g., at a pH in the range pH 6.0-11).
In some preferred embodiments, the range is pH 7.0-10.0. In alternative embodiments,
the temperature is in the range of 20-30°C, while in other embodiments it is preferably in
the range 24-28°C. In yet further embodiments, the reaction time is in the range 2-24
hours, while the preferred range is 4-16 hours. In additional embodiments, tensides and/or
preservatives are provided as desired. 

The second phase of the bating step usually commences with the addition of the
bate itself. In some embodiments, the enzymatic treatment takes place during bating. In
some preferred embodiments, the enzymatic treatment takes place during bating, after the
deliming phase. In some embodiments, the bating process of the present invention is
performed using conventional conditions (e.g., at a pH in the range pH 6.0-9.0). In some
preferred embodiments, the pH range is 6.0 to 8.5. In further embodiments, the
temperature is in the range of 20-30°C, while in preferred embodiments, the temperature is
in the range of 25-28°C. In some embodiments, the reaction time is in the range of 20-90
minutes, while in other embodiments, it is in the range 40-80 minutes. Processes for the
manufacture of leather are well known to those skilled in the art (See e.g., WO 94/069429,
WO 90/1121 189, U.S. Pat. No. 3,840,433, EP 505920, GB 2233665, and U.S. Pat. No.
3,986,926, all of which are herein incorporated by reference).

In further embodiments, the present invention provides bates comprising at least one 
protease of the present invention. A bate is an agent or an enzyme-containing preparation 
comprising the chemically active ingredients for use in beamhouse processes, in particular 
in the bating step of a process for the manufacture of leather. In some embodiments, the 
present invention provides bates comprising protease and suitable excipients. In some
embodiments, the excipients include agents including, but not limited to, chemicals known
and used in the art (e.g., diluents, emulgators, delimers and carriers). In some embodiments,
the bate comprising at least one protease of the present invention is formulated as known
in the art (See e.g., GB-A 2250289, WO 96/11285, and EP 0784703).

In some embodiments, the bate of the present invention contains from 0.00005 to 
0.01 g of active protease per g of bate, while in other embodiments, the bate contains from 
0.0002 to 0.004 g of active protease per g of bate. 

Thus, the proteases of the present invention find use in numerous applications and 
settings. 

EXPERIMENTAL 

The present invention is described in further detail in the following Examples which 
are not in any way intended to limit the scope of the invention as claimed. The attached 
Figures are meant to be considered as integral parts of the specification and description of 
the invention. All references cited are herein specifically incorporated by reference for all 
that is described therein. The following Examples are offered to illustrate, but not to limit the
claimed invention.

In the experimental disclosure which follows, the following abbreviations apply: PI
(proteinase inhibitor), ppm (parts per million); M (molar); mM (millimolar); μM (micromolar);
nM (nanomolar); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); gm
(grams); mg (milligrams); μg (micrograms); pg (picograms); L (liters); ml and mL (milliliters);
μl and μL (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm
(nanometers); U (units); V (volts); MW (molecular weight); sec (seconds); min(s)
(minute/minutes); h(s) and hr(s) (hour/hours); °C (degrees Centigrade); QS (quantity
sufficient); ND (not done); NA (not applicable); rpm (revolutions per minute); H2O (water);
dH2O (deionized water); HCl (hydrochloric acid); aa (amino acid); bp (base pair); kb
(kilobase pair); kD (kilodaltons); cDNA (copy or complementary DNA); DNA
(deoxyribonucleic acid); ssDNA (single stranded DNA); dsDNA (double stranded DNA);
dNTP (deoxyribonucleotide triphosphate); RNA (ribonucleic acid); MgCl2 (magnesium
chloride); NaCl (sodium chloride); w/v (weight to volume); v/v (volume to volume); g
(gravity); OD (optical density); Dulbecco's phosphate buffered solution (DPBS); SOC (2%
Bacto-Tryptone, 0.5% Bacto Yeast Extract, 10 mM NaCl, 2.5 mM KCl); Terrific Broth (TB; 12
g/l Bacto Tryptone, 24 g/l glycerol, 2.31 g/l KH2PO4, and 12.54 g/l K2HPO4); OD280 (optical
density at 280 nm); OD600 (optical density at 600 nm); A405 (absorbance at 405 nm); Vmax
(the maximum initial velocity of an enzyme catalyzed reaction); PAGE (polyacrylamide gel
electrophoresis); PBS (phosphate buffered saline [150 mM NaCl, 10 mM sodium phosphate
buffer, pH 7.2]); PBST (PBS+0.25% TWEEN® 20); PEG (polyethylene glycol); PCR
(polymerase chain reaction); RT-PCR (reverse transcription PCR); SDS (sodium dodecyl
sulfate); Tris (tris(hydroxymethyl)aminomethane); HEPES (N-[2-Hydroxyethyl]piperazine-
N-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS (sodium dodecylsulfate);
Tris-HCl (tris[Hydroxymethyl]aminomethane-hydrochloride); Tricine (N-[tris-(hydroxymethyl)-
methyl]-glycine); CHES (2-(N-cyclo-hexylamino) ethane-sulfonic acid); TAPS (3-{[tris-
(hydroxymethyl)-methyl]-amino}-propanesulfonic acid); CAPS (3-(cyclo-hexylamino)-
propane-sulfonic acid); DMSO (dimethyl sulfoxide); DTT (1,4-dithio-DL-threitol); SA (sinapinic
acid (3,5-dimethoxy-4-hydroxy cinnamic acid)); TCA (trichloroacetic acid); Glut and GSH
(reduced glutathione); GSSG (oxidized glutathione); TCEP (Tris[2-carboxyethyl] phosphine);
Ci (Curies); mCi (milliCuries); μCi (microCuries); HPLC (high pressure liquid
chromatography); RP-HPLC (reverse phase high pressure liquid chromatography); TLC
(thin layer chromatography); MALDI-TOF (matrix-assisted laser desorption/ionization-time
of flight); Ts (tosyl); Bn (benzyl); Ph (phenyl); Ms (mesyl); Et (ethyl), Me (methyl); Taq
(Thermus aquaticus DNA polymerase); Klenow (DNA polymerase I large (Klenow)
fragment); rpm (revolutions per minute); EGTA (ethylene glycol-bis(β-aminoethyl ether) N,
N, N', N'-tetraacetic acid); EDTA (ethylenediaminetetraacetic acid); bla (β-lactamase or
ampicillin-resistance gene); HDL (heavy duty liquid detergent, i.e., laundry detergent); MJ
Research (MJ Research, Reno, NV); Baseclear (Baseclear BV, Inc., Leiden, the
Netherlands); PerSeptive (PerSeptive Biosystems, Framingham, MA); ThermoFinnigan
(ThermoFinnigan, San Jose, CA); Argo (Argo BioAnalytica, Morris Plains, NJ); Seitz EKS
(SeitzSchenk Filtersystems GmbH, Bad Kreuznach, Germany); Pall (Pall Corp., East Hills, 
NY); Spectrum (Spectrum Laboratories, Dominguez Rancho, CA); Molecular Structure 
(Molecular Structure Corp., Woodlands, TX); Accelrys (Accelrys, Inc., San Diego, CA); 
Chemical Computing (Chemical Computing Corp., Montreal, Canada); New Brunswick (New 
Brunswick Scientific, Co., Edison, NJ); CFT (Center for Test Materials, Vlaardingen, the
Netherlands); Procter & Gamble (Procter & Gamble, Inc., Cincinnati, OH); GE Healthcare
(GE Healthcare, Chalfont St. Giles, United Kingdom); DNA2.0 (DNA2.0, Menlo Park, CA); 
OXOID (Oxoid, Basingstoke, Hampshire, UK); Megazyme (Megazyme International Ireland 
Ltd., Bray Business Park, Bray, Co., Wicklow, Ireland); Finnzymes (Finnzymes Oy, Espoo, 
Finland); Kelco (CP Kelco, Wilmington, DE); Corning (Corning Life Sciences, Corning, NY); 
NEN (NEN Life Science Products, Boston, MA); Pharma AS (Pharma AS, Oslo, Norway);
Dynal (Dynal, Oslo, Norway); Bio-Synthesis (Bio-Synthesis, Lewisville, TX); ATCC 
(American Type Culture Collection, Rockville, MD); Gibco/BRL (Gibco/BRL, Grand Island,
NY); Sigma (Sigma Chemical Co., St. Louis, MO); Pharmacia (Pharmacia Biotech,
Piscataway, NJ); NCBI (National Center for Biotechnology Information); Applied Biosystems 
(Applied Biosystems, Foster City, CA); BD Biosciences and/or Clontech (BD Biosciences 
CLONTECH Laboratories, Palo Alto, CA); Operon Technologies (Operon Technologies, 
Inc., Alameda, CA); MWG Biotech (MWG Biotech, High Point, NC); Oligos Etc (Oligos Etc. 
Inc, Wilsonville, OR); Bachem (Bachem Bioscience, Inc., King of Prussia, PA); Difco (Difco 
Laboratories, Detroit, MI); Mediatech (Mediatech, Herndon, VA); Santa Cruz (Santa Cruz
Biotechnology, Inc., Santa Cruz, CA); Oxoid (Oxoid Inc., Ogdensburg, NY); Worthington 
(Worthington Biochemical Corp., Freehold, NJ); GIBCO BRL or Gibco BRL (Life 
Technologies, Inc., Gaithersburg, MD); Millipore (Millipore, Billerica, MA); Bio-Rad (Bio-Rad, 
Hercules, CA); Invitrogen (Invitrogen Corp., San Diego, CA); NEB (New England Biolabs, 
Beverly, MA); Sigma (Sigma Chemical Co., St. Louis, MO); Pierce (Pierce Biotechnology, 
Rockford, IL); Takara (Takara Bio Inc., Otsu, Japan); Roche (Hoffmann-La Roche, Basel, 
Switzerland); EM Science (EM Science, Gibbstown, NJ); Qiagen (Qiagen, Inc., Valencia, 
CA); Biodesign (Biodesign Intl., Saco, Maine); Aptagen (Aptagen, Inc., Herndon, VA); 
Sorvall (Sorvall brand, from Kendro Laboratory Products, Asheville, NC); Molecular Devices 
(Molecular Devices, Corp., Sunnyvale, CA); R&D Systems (R&D Systems, Minneapolis, 
MN); Stratagene (Stratagene Cloning Systems, La Jolla, CA); Marsh (Marsh Biosciences, 
Rochester, NY); Bio-Tek (Bio-Tek Instruments, Winooski, VT); Biacore (Biacore, Inc.,
Piscataway, NJ); PeproTech (PeproTech, Rocky Hill, NJ); SynPep (SynPep, Dublin, CA); 
New Objective (New Objective brand; Scientific Instrument Services, Inc., Ringoes, NJ); 
Waters (Waters, Inc., Milford, MA); Matrix Science (Matrix Science, Boston, MA); Dionex 
(Dionex, Corp., Sunnyvale, CA); Monsanto (Monsanto Co., St. Louis, MO); Wintershall 
(Wintershall AG, Kassel, Germany); BASF (BASF Co., Florham Park, NJ); Huntsman 
(Huntsman Petrochemical Corp., Salt Lake City, UT); Enichem (Enichem Iberica, Barcelona, 
Spain); Fluka Chemie AG (Fluka Chemie AG, Buchs, Switzerland); Gist-Brocades (Gist- 
Brocades, NV, Delft, the Netherlands); Dow Corning (Dow Corning Corp., Midland, MI); and
Microsoft (Microsoft, Inc., Redmond, WA).

EXAMPLE 1 
Assays 

In the following Examples, various assays were used, such as protein 
determinations, application-based tests, and stability-based tests. For ease in reading, the 
following assays are set forth below and referred to in the respective Examples. Any 
deviations from the protocols provided below in any of the experiments performed during the 
development of the present invention are indicated in the Examples. 

Some of the detergents used in the following Examples had the following
compositions. In Compositions I and II, the balance (to 100%) is perfume/dye and/or water.
The pH of these compositions was from about 5 to about 7 for Composition I, and about 7.5
to about 8.5 for Composition II. In Composition III, the balance (to 100%) comprised water
and/or the minors perfume, dye, brightener/SRP1/sodium
carboxymethylcellulose/photobleach/MgSO4/PVPVI/suds suppressor/high molecular
PEG/clay.



DETERGENT COMPOSITIONS

Ingredient                          Composition I    Composition II
                                                     7.0
CFAA                                                 4.0
C12-C14 Fatty alcohol ethoxylate    12.0             1.0
C12-C18 Fatty acid                  3.0              4.0
Citric acid (anhydrous)             6.0              3.0
DETPMP                                               1.0
Monoethanolamine                    5.0              5.0
Sodium hydroxide                                     1.0
1 N HCl aqueous solution            #1
Propanediol                         12.7             10.
Ethanol                             1.8              5.4
DTPA                                0.5              0.4
Pectin Lyase                                         0.005
Lipase                              0.1
Amylase                             0.001
Cellulase                                            0.0002
Protease A
Aldose Oxidase
DETBCHD                                              0.01
SRP1                                0.5              0.3
Boric acid                          2.4              2.8
Sodium xylene sulfonate
DC 3225C                            1.0              1.0
2-butyl-octanol                     0.03             0.03
Brightener 1                        0.12             0.08


Composition III 

C14-C15AS or sodium tallow alkyl sulfate

LAS

C12-C15AE3S

C12-C15E5 or E3

QAS 

Zeolite A 

SKS-6 (dry add) 

MA/AA 

AA 

3Na Citrate 2H2O

Citric Acid (Anhydrous) 

DTPA 

EDDS 

HEDP 

PB1 



3.0 

8.0 

1.0
5.0

11.0
9.0 
2.0 



1.5 

0.5 
0.2
Composition III

Percarbonate 



3.8 



NOBS 



NACA OBS 



2.0 



TAED 



2.0 



BB1 



0.34 



BB2 



Anhydrous Na Carbonate 
Sulfate 



2.0 



8.0 



Silicate 
Protease B 

Protease C - 

Lipase 

Amylase 

Cellulase 

Pectin Lyase 0.001 
Aldose Oxidase 0.05 
PAAC 

A. TCA Assay for Protein Content Determination in 96-well Microtiter Plates

This assay was started using filtered culture supernatant from microtiter plates grown
4 days at 33 °C with shaking at 230 RPM and humidified aeration. A fresh 96-well flat
bottom plate was used for the assay. First, 100 μL/well of 0.25 N HCl were placed in the
wells. Then, 50 μL filtered culture broth were added to the wells. The light
scattering/absorbance at 405 nm (use 5 sec mixing mode in the plate reader) was then
determined, in order to provide the "blank" reading.

For the test, 100 μL/well 15% (w/v) TCA was placed in the plates and incubated
between 5 and 30 min at room temperature. The light scattering/absorbance at 405 nm 
(use 5 sec mixing mode in the plate reader) was then determined. 

The calculations were performed by subtracting the blank (i.e., no TCA) from the test
reading with TCA. If desired, a standard curve can be created by calibrating the TCA
readings with AAPF assays of clones with known conversion factors. However, the TCA
results are linear with respect to protein concentration from 50 to 500 ppm and can thus be
plotted directly against enzyme performance for the purpose of choosing good-performing
variants.
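
The blank subtraction described above can be sketched as follows. This is only an illustrative sketch: the function names and the example readings are hypothetical and are not taken from the Examples, and no standard curve is applied.

```python
def tca_signal(blank_a405, test_a405):
    """Return the TCA-induced signal: reading with TCA minus the no-TCA blank."""
    return test_a405 - blank_a405


def tca_signals_for_plate(blank_readings, test_readings):
    """Apply the blank correction well by well for two equally sized plate reads."""
    if len(blank_readings) != len(test_readings):
        raise ValueError("blank and test plates must have the same number of wells")
    return [tca_signal(b, t) for b, t in zip(blank_readings, test_readings)]


if __name__ == "__main__":
    # Hypothetical A405 readings for three wells (illustration only).
    blanks = [0.05, 0.06, 0.05]
    tests = [0.30, 0.42, 0.18]
    for well, signal in enumerate(tca_signals_for_plate(blanks, tests), start=1):
        print(f"well {well}: TCA signal = {signal:.3f}")
```

Because the response is stated to be linear from 50 to 500 ppm, the corrected signal can be used directly for ranking within that range.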

B. suc-AAPF-pNA Assay of Proteases in 96-well Microtiter Plates

In this assay system, the reagent solutions used were:

1. 100 mM Tris/HCl, pH 8.6, containing 0.005% TWEEN®-80 (Tris buffer)

2. 100 mM Tris buffer, pH 8.6, containing 10 mM CaCl2 and 0.005% TWEEN®-80 (Tris/Ca buffer)

3. 160 mM suc-AAPF-pNA in DMSO (suc-AAPF-pNA stock solution) (Sigma: S-7388)

To prepare suc-AAPF-pNA working solution, 1 ml AAPF stock was added to 100 ml 
Tris/Ca buffer and mixed well for at least 10 seconds. 
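
As a quick arithmetic check of this dilution, the sketch below computes the working-solution concentration. The molecular weight of suc-AAPF-pNA (about 624.6 g/mol) is an assumption not stated in this document, and the function name is hypothetical.

```python
# Assumed molecular weight of suc-AAPF-pNA in g/mol (not given in the text).
SUC_AAPF_PNA_MW = 624.6


def working_solution(stock_mM, stock_ml, buffer_ml, mw=SUC_AAPF_PNA_MW):
    """Return (concentration in mM, concentration in mg/ml) after dilution."""
    total_ml = stock_ml + buffer_ml
    conc_mM = stock_mM * stock_ml / total_ml
    # mM * g/mol gives mg/L; divide by 1000 to obtain mg/ml.
    mg_per_ml = conc_mM * mw / 1000.0
    return conc_mM, mg_per_ml


if __name__ == "__main__":
    conc_mM, mg_per_ml = working_solution(stock_mM=160.0, stock_ml=1.0, buffer_ml=100.0)
    print(f"{conc_mM:.2f} mM, {mg_per_ml:.2f} mg/ml")
```

Under that assumed molecular weight, the result is roughly 1.6 mM, or about 1 mg/ml.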

The assay was performed by adding 10 μl of diluted protease solution to each well,
followed by the addition (quickly) of 190 μl 1 mg/ml AAPF-working solution. The
solutions were mixed for 5 sec, and the absorbance change was read at 410 nm in
an MTP reader, at 25°C. The protease activity was expressed as AU (activity =
ΔOD·min⁻¹·ml⁻¹).
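
One way to obtain an activity in these units from a kinetic read is sketched below. It assumes, without confirmation from the text, that the "per ml" refers to the volume of diluted protease added to the well (10 μl) and that any pre-dilution is corrected for by a simple factor; the function names and the example readings are hypothetical.

```python
def slope_per_min(times_min, absorbances):
    """Least-squares slope of absorbance versus time (OD units per minute)."""
    n = len(times_min)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two matching time/absorbance points")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    return sxy / sxx


def activity_au(times_min, absorbances, sample_ml=0.010, dilution_factor=1.0):
    """Activity as delta-OD per minute per ml of undiluted sample (assumed convention)."""
    return slope_per_min(times_min, absorbances) / sample_ml * dilution_factor


if __name__ == "__main__":
    # Hypothetical A410 readings taken once per minute.
    times = [0.0, 1.0, 2.0, 3.0]
    a410 = [0.10, 0.15, 0.20, 0.25]
    print(f"activity = {activity_au(times, a410):.2f} AU")
```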

C. Keratin Hydrolysis Assay 

In this assay system, the chemical and reagent solutions used were: 

Keratin:                     ICN 902111
Detergent:                   Detergent Composition II. 1.6 g detergent is dissolved in 1000 ml
                             water (pH = 8.2). 0.6 ml CaCl2/MgCl2 of 10,000 gpg is added, as well
                             as 1190 mg HEPES, giving a hardness and buffer strength of 6 gpg and
                             5 mM, respectively. The pH is adjusted to 8.2 with NaOH.
Picrylsulfonic acid (TNBS):  Sigma P-2297 (5% solution in water)
Reagent A:                   45.4 g Na2B4O7·10H2O (Merck 6308) and 15 ml of 4N NaOH are
                             dissolved together to a final volume of 1000 ml (by heating if needed)
Reagent B:                   35.2 g NaH2PO4·1H2O (Merck 6346) and 0.6 g Na2SO3 (Merck 6657)
                             are dissolved together to a final volume of 1000 ml.


Method: 

Prior to the incubations, keratin was sieved on a 100 μm sieve in small portions at a
time. Then, 10 g of the < 100 μm keratin was stirred in detergent solution for at least 20
minutes at room temperature with regular adjustment of the pH to 8.2. Then, the
suspension was centrifuged for 20 minutes at room temperature (Sorvall, GSA rotor, 13,000
rpm). This procedure was then repeated. Finally, the wet sediment was suspended in
detergent to a total volume of 200 ml, and the suspension was kept stirred during pipetting.
Prior to incubation, microtiter plates (MTPs) were filled with 200 μl substrate per well with a
Biohit multichannel pipette and 1200 μl tip (6 dispenses of 200 μl and dispensed as fast as
possible to avoid settling of keratin in the tips). Then, 10 μl of the filtered culture was added
to the substrate containing MTPs. The plates were covered with tape, placed in an incubator
and incubated at 20 °C for 3 hours at 350 rpm (Innova 4330 [New Brunswick]). Following
incubation, the plates were centrifuged for 3 minutes at 3000 rpm (Sigma 6K 15 centrifuge).
About 15 minutes before removal of the first plate from the incubator, the TNBS reagent was
prepared by mixing 1 ml TNBS solution per 50 ml of reagent A.

MTPs were filled with 60 μl TNBS reagent A per well. From the incubated plates, 10
μl was transferred to the MTPs with TNBS reagent A. The plates were covered with tape
and shaken for 20 minutes in a bench shaker (BMG Thermostar) at room temperature and
500 rpm. Finally, 200 μl of reagent B was added to the wells, mixed for 1 minute on a
shaker, and the absorbance at 405 nm was measured with the MTP-reader.

Calculation of the Keratin Hydrolyzing Activity:

The obtained absorbance value was corrected for the blank value (substrate without 
enzyme). The resulting absorbance provides a measure for the hydrolytic activity. For each 
sample (variant) the performance index was calculated. The performance index compares 
the performance of the variant (actual value) and the standard enzyme (theoretical value) at 
the same protein concentration. In addition, the theoretical values can be calculated, using 
the parameters of the Langmuir equation of the standard enzyme. A performance index (PI)
that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g.,
wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard,
and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard.
Thus, the PI identifies winners, as well as variants that are less desirable for use under 
certain circumstances. 
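
The performance index calculation can be sketched as below. The text does not give the exact form or the fitted parameters of the Langmuir equation, so the two-parameter form used here (a saturating response with a maximum `a_max` and a half-saturation constant `k`) and all numerical values are assumptions for illustration only.

```python
def langmuir(conc_ppm, a_max, k):
    """Assumed Langmuir-type response of the standard enzyme at a given concentration."""
    return a_max * conc_ppm / (k + conc_ppm)


def performance_index(variant_abs, blank_abs, conc_ppm, a_max, k):
    """PI = blank-corrected variant response / theoretical standard response
    at the same protein concentration."""
    actual = variant_abs - blank_abs
    theoretical = langmuir(conc_ppm, a_max, k)
    if theoretical <= 0:
        raise ValueError("theoretical response must be positive")
    return actual / theoretical


if __name__ == "__main__":
    # Hypothetical standard-enzyme fit and variant reading.
    pi = performance_index(variant_abs=0.70, blank_abs=0.10,
                           conc_ppm=2.0, a_max=1.0, k=2.0)
    verdict = "better than" if pi > 1 else "same as" if pi == 1 else "worse than"
    print(f"PI = {pi:.2f} ({verdict} the standard)")
```

Comparing at equal protein concentration is what makes the index independent of expression level, which is why the protein determination (e.g., the TCA assay) is needed alongside the activity reading.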

D. Microswatch Assay for Testing Protease Performance

None of the detergents used in these assays contained enzymes.



WO 2005/052146 



PCT/US2004/039066 




Detergent Preparations: 

1. European Detergent Solution: 

Milli-Q water was adjusted to 15 gpg water hardness (Ca/Mg=4/1), 7.6 g/l
ARIEL® Regular detergent was added, and the detergent solution was stirred vigorously for
at least 30 minutes. The detergent was filtered before use in the assay through a 0.22 μm
filter (e.g. Nalgene top bottle filter).

2. Japanese Detergent Solution:

Milli-Q water was adjusted to 3 gpg water hardness (Ca/Mg=3/1), 0.66 g/l
Detergent Composition III was added, and the detergent solution was stirred vigorously for
at least 30 minutes. The detergent was filtered before use in the assay through a 0.22 μm
filter (e.g. Nalgene top bottle filter).

3. Cold Water Liquid Detergent (US Conditions):

Milli-Q water was adjusted to 6 gpg water hardness (Ca/Mg=3/1), 1.60 g/l
TIDE® LVJ-1 detergent was added, and the detergent solution was stirred vigorously for at
least 15 minutes. Then, 5 mM HEPES buffer was added and the pH was set at 8.2. The
detergent was filtered before use in the assay through a 0.22 μm filter (e.g. Nalgene top
bottle filter).

4. Low pH Liquid Detergent (US Conditions):

Milli-Q water was adjusted to 6 gpg water hardness (Ca/Mg=3/1), 1.60 g/l Detergent
Composition I was added, and the detergent solution was stirred vigorously for at least 15
minutes. The pH was set at 6.0 using 1N NaOH solution. The detergent was filtered before
use in the assay through a 0.22 μm filter (e.g. Nalgene top bottle filter).

Microswatches:

Microswatches of ¼" circular diameter were ordered and delivered by CFT
Vlaardingen. The microswatches were pretreated using the fixation method described 
below. Single microswatches were placed in each well of a 96-well microtiter plate vertically 
to expose the whole surface area (i.e., not flat on the bottom of the well). 

Bleach Fixation ("Superfixed"):

In a 10 L stainless steel beaker containing 10 L of water, the water was heated to
60°C for fixation of swatches used in European conditions (=Superfixed). For Japanese
condition(s) and other conditions, the swatches were fixed at room temperature (=3K).
Then, 10 ml of 30% hydrogen peroxide (1 ml/L of H2O2; final conc. of H2O2 is 300 ppm)
were added. Then, 100 swatches (10 swatches/L) were added to the solution. The solution
was allowed to sit for 30 minutes with occasional stirring and monitoring of the temperature.
The swatches were rinsed 7-8 times with cold water and placed on the bench to dry. A towel
was placed on top of the swatches, as this prevents the swatches from curling up. For the 3K
swatches, the procedure was repeated (except the water was not heated and 10x the amount
of hydrogen peroxide was added).

Alternative Fixation ("3K" Swatch Fixation): 

This particular swatch fixation was done at room temperature; however, the amount
of 30% H2O2 added was 10X more than in the Superfixed Swatch Fixation. Bubble formation
(frothing) will be visible and therefore it is necessary to use a bigger beaker to account for
this. First, 8 liters of distilled water were placed in a 10 L beaker, and 80 ml of 30% hydrogen
peroxide were added. The water and peroxide were mixed well with a ladle. Then, 40 pieces
of EMPA 116 swatches were spread into a fan before adding into the solution to ensure
uniform fixation. The swatches were swirled in the solution (using the ladle) for 30 minutes,
continuously for the first five minutes and occasionally for the remaining 25 minutes. The
solution was discarded and the swatches were rinsed 6 times with approximately 6 liters of
distilled water each time. The swatches were placed on top of paper towels to dry. The
air-dried swatches were punched using a ¼" circular die on an expulsion press. A single
microswatch was placed vertically into each well of a 96-well microtiter plate to expose the
whole surface area (i.e. not flat on the bottom of the well).
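
The peroxide levels used in the two fixation methods can be checked with a short calculation. This sketch follows the simplification implied by the text's own "1 ml/L ... 300 ppm" figure: the 30% stock is treated as a plain fraction and the small volume contributed by the stock is ignored. The function name is hypothetical.

```python
def h2o2_ppm(stock_percent, stock_ml, water_l):
    """Approximate final H2O2 level in ppm, ignoring the volume added by the stock."""
    fraction = stock_percent / 100.0
    return fraction * stock_ml / (water_l * 1000.0) * 1e6


if __name__ == "__main__":
    superfixed = h2o2_ppm(stock_percent=30.0, stock_ml=10.0, water_l=10.0)
    three_k = h2o2_ppm(stock_percent=30.0, stock_ml=80.0, water_l=8.0)
    print(f"Superfixed: about {superfixed:.0f} ppm")
    print(f"3K:         about {three_k:.0f} ppm ({three_k / superfixed:.0f}x)")
```

On this basis the 3K fixation corresponds to about 3000 ppm, which is consistent with both the tenfold ratio stated above and the "3K" name.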

Enzyme Samples: 

The enzyme samples were tested at appropriate concentrations for the respective 
geography, and diluted in 10 mM NaCI, 0.005% TWEEN®-80 solution. 

Test Method: 

The incubator was set at the desired temperature: 20°C for cold water liquid
conditions; 30°C for low-pH liquid conditions; 40°C for European conditions; 20°C for
Japanese and North American conditions. The pretreated and precut swatches were placed
into the wells of a 96-well MTP, as described above. The enzyme samples were diluted, if
needed, in 10 mM NaCl, 0.005% TWEEN®-80 to 20x the desired concentration. The
desired detergent solutions were prepared as described above. Then, 190 μl of detergent
solution were added to each well of the MTP. To this mixture, 10 μl of enzyme solution were
added to each well (to provide a total volume of 200 μl/well). The MTP was sealed with a
plate sealer and placed in an incubator for 60 minutes, with agitation at 350 rpm. Following
incubation under the appropriate conditions, 100 μl of solution from each well were removed
and placed into a fresh MTP. The new MTP containing 100 μl of solution/well was read at
405 nm in an MTP reader. Blank controls, as well as a control containing a microswatch and
detergent but no enzyme, were also included.
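
The 20x pre-dilution follows directly from the well volumes (10 μl of enzyme in a 200 μl total). A minimal sketch, with a hypothetical function name, shows the stock concentration needed for a given in-well dose:

```python
def stock_conc_for_well(target_ppm, enzyme_ul=10.0, detergent_ul=190.0):
    """Concentration the diluted enzyme sample must have so that the well
    ends up at target_ppm after mixing with the detergent solution."""
    total_ul = enzyme_ul + detergent_ul
    return target_ppm * total_ul / enzyme_ul


if __name__ == "__main__":
    # The dosage range used in the assay is 0.5-4 ppm in the well.
    for dose in (0.5, 1.0, 2.0, 4.0):
        print(f"{dose} ppm in well -> {stock_conc_for_well(dose):.0f} ppm sample")
```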



Table 1-1. Detergent Composition and Incubation Conditions in the μSwatch Assay

Geography                 Reference Enzyme  Detergent                     Water Hardness        Enzyme Dosage [ppm]  Temp.  Swatch
European                  ASP, GG36         7.6 g/l ARIEL® Regular        15 gpg - Ca/Mg: 4/1   0.5-4                40°    Superfix
Japanese                  ASP, GG36         0.66 g/l Detergent Comp. III  3 gpg - Ca/Mg: 3/1    0.5-4                20°    3K
Cold Water Liquid         ASP               1.6 g/l Tide® LVJ-1           6 gpg - Ca/Mg: 3/1    0.5-4                20°    3K
Liquid Detergent Comp. I  ASP               1.6 g/l Detergent Comp. I     6 gpg - Ca/Mg: 3/1    0.5-4                30°    3K



** The stock solution was used at a concentration of 15,000 gpg
stock #1 = Ca/Mg 3:1
(1.92 M Ca2+ = 282.3 g/L CaCl2·2H2O; 0.64 M Mg2+ = 130.1 g/L MgCl2·6H2O)
stock #2 = Ca/Mg 4:1
(2.05 M Ca2+ = 301.4 g/L CaCl2·2H2O; 0.51 M Mg2+ = 103.7 g/L MgCl2·6H2O)
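
The stock recipes can be cross-checked numerically. The sketch below uses values that are assumptions rather than statements from this document: molecular weights of about 147.01 g/mol for CaCl2·2H2O and 203.30 g/mol for MgCl2·6H2O, and the usual conversion of 1 gpg to roughly 0.171 mmol/L of hardness ions (17.1 mg/L as CaCO3). The function names are hypothetical.

```python
MW_CACL2_2H2O = 147.01      # g/mol, assumed
MW_MGCL2_6H2O = 203.30      # g/mol, assumed
MMOL_PER_L_PER_GPG = 0.171  # assumed: 1 gpg ~ 17.1 mg/L as CaCO3


def grams_per_liter(molar, mw):
    """Mass of salt per liter needed for a given molarity."""
    return molar * mw


def hardness_gpg(ca_molar, mg_molar):
    """Total hardness of a stock in grains per gallon."""
    return (ca_molar + mg_molar) * 1000.0 / MMOL_PER_L_PER_GPG


def stock_ml_per_liter(target_gpg, stock_gpg=15000.0):
    """Volume of hardness stock to add per liter of water for a target hardness."""
    return target_gpg / stock_gpg * 1000.0


if __name__ == "__main__":
    print(f"stock #1 Ca: {grams_per_liter(1.92, MW_CACL2_2H2O):.1f} g/L")
    print(f"stock #1 Mg: {grams_per_liter(0.64, MW_MGCL2_6H2O):.1f} g/L")
    print(f"stock #1 hardness: about {hardness_gpg(1.92, 0.64):.0f} gpg")
    for gpg in (3, 6, 15):
        print(f"{gpg} gpg -> {stock_ml_per_liter(gpg):.1f} ml stock per liter")
```

With these assumed constants, both stocks come out close to the stated 15,000 gpg, so 1 ml of stock per liter gives about 15 gpg.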

Calculation of the BMI Performance: 

The obtained absorbance value was corrected for the blank value (obtained after
incubation of microswatches in the absence of enzyme). The resulting absorbance was a
measure for the hydrolytic activity. For each sample (variant) the performance index was
calculated. The performance index compares the performance of the variant (actual value)
and the standard enzyme (theoretical value) at the same protein concentration. In addition,
the theoretical values can be calculated, using the parameters of the Langmuir equation of
the standard enzyme. A performance index (PI) that is greater than 1 (PI>1) identifies a
better variant (as compared to the standard [e.g., wild-type]), while a PI of 1 (PI=1) identifies
a variant that performs the same as the standard, and a PI that is less than 1 (PI<1)
identifies a variant that performs worse than the standard.

Thus, the PI identifies winners, as well as variants that are less desirable for use under 
certain circumstances. 
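
The performance index calculation can be sketched as follows. This is a minimal illustration, not the procedure actually used: the text does not give the form of the Langmuir equation, so the saturation form A = Amax * c / (K + c) is an assumption, and the function names, parameter names and all numerical values are hypothetical.

```python
def langmuir(conc_ppm: float, a_max: float, k_half: float) -> float:
    """Assumed Langmuir-type dose response: absorbance expected at a given enzyme dose."""
    return a_max * conc_ppm / (k_half + conc_ppm)


def performance_index(variant_abs: float, blank_abs: float, conc_ppm: float,
                      a_max: float, k_half: float) -> float:
    """Blank-corrected variant absorbance divided by the standard's fitted value
    at the same protein concentration."""
    actual = variant_abs - blank_abs                  # variant, corrected for the blank
    theoretical = langmuir(conc_ppm, a_max, k_half)   # standard enzyme at the same dose
    return actual / theoretical


# Hypothetical readings: a_max and k_half would come from fitting the standard enzyme.
example_pi = performance_index(variant_abs=0.62, blank_abs=0.10,
                               conc_ppm=2.0, a_max=0.80, k_half=2.0)
print(f"PI = {example_pi:.2f}")  # prints PI = 1.30, i.e. better than the standard
```

With these made-up numbers the standard is expected to give 0.40 absorbance units at 2 ppm, the variant gives 0.52, and the PI is therefore above 1.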



D. Dimethylcasein Hydrolysis Assay (96 wells) 



In this assay system, the chemical and reagent solutions used were: 

Dimethylcasein (DMC):        Sigma C-9801 
TWEEN®-80:                   Sigma P-8074 
PIPES buffer (free acid):    Sigma P-1851; 15.1 g is dissolved in about 960 ml water; pH is 
                             adjusted to 7.0 with 4N NaOH, 1 ml 5% TWEEN®-80 is 
                             added and the volume brought up to 1000 ml. The final 
                             concentration of PIPES and TWEEN®-80 is 50 mM and 
                             0.005% respectively. 
Picrylsulfonic acid (TNBS):  Sigma P-2297 (5% solution in water) 
Reagent A:                   45.4 g Na2B4O7.10H2O (Merck 6308) and 15 ml of 4N NaOH 
                             are dissolved together to a final volume of 1000 ml (by 
                             heating if needed) 
Reagent B:                   35.2 g NaH2PO4.1H2O (Merck 6346) and 0.6 g Na2SO3 (Merck 
                             6657) are dissolved together to a final volume of 1000 ml. 



Method: 

To prepare the substrate, 4 g DMC were dissolved in 400 ml PIPES buffer. The filtered 
culture supernatants were diluted with PIPES buffer; the final concentration of the controls in 
the growth plate was 20 ppm. Then, 10 µl of each diluted supernatant were added to 200 µl 
substrate in the wells of an MTP. The MTP was covered with tape, shaken for a few 
seconds and placed in an oven at 37°C for 2 hours without agitation. 

About 15 minutes before removal of the first plate from the oven, the TNBS reagent was 
prepared by mixing 1 ml TNBS solution per 50 ml of reagent A. MTPs were filled with 60 µl 
TNBS reagent A per well. The incubated plates were shaken for a few seconds, after which 
10 µl were transferred to the MTPs with TNBS reagent A. The plates were covered with 
tape and shaken for 20 minutes in a bench shaker (BMG Thermostar) at room temperature 
and 500 rpm. Finally, 200 µl reagent B were added to the wells, mixed for 1 minute on a 
shaker, and the absorbance at 405 nm was determined using an MTP reader. 

Calculation of Dimethylcasein Hydrolyzing Activity: 

The obtained absorbance value was corrected for the blank value (substrate without 
enzyme). The resulting absorbance is a measure of the hydrolytic activity. The (arbitrary) 
specific activity of a sample was calculated by dividing the absorbance by the determined 
protein concentration. 
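
As a small illustration of this calculation, the sketch below divides the blank-corrected absorbance by the protein concentration. The function name and the example readings are hypothetical, and the result is in arbitrary units (absorbance per ppm), as in the text.

```python
def specific_activity(sample_abs: float, blank_abs: float, protein_ppm: float) -> float:
    """Blank-corrected A405 divided by the protein concentration (arbitrary units)."""
    if protein_ppm <= 0:
        raise ValueError("protein concentration must be positive")
    return (sample_abs - blank_abs) / protein_ppm


# Hypothetical well: A405 of 0.90 against a substrate-only blank of 0.10 at 20 ppm protein.
example_activity = specific_activity(sample_abs=0.90, blank_abs=0.10, protein_ppm=20.0)
print(f"{example_activity:.3f} absorbance units per ppm")
```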

E. Thermostability Assay 

This assay is based on the dimethylcasein hydrolysis, before and after heating of the 
buffered culture supernatant. The same chemical and reagent solutions were used as 
described in the dimethylcasein hydrolysis assay. 

Method: 

The filtered culture supernatants were diluted to 20 ppm in PIPES buffer (based on 
the concentration of the controls in the growth plates). Then, 50 µl of each diluted 
supernatant were placed in the empty wells of an MTP. The MTP was incubated in an 
iEMS incubator/shaker HT (Thermo Labsystems) for 90 minutes at 60°C and 400 rpm. The 
plates were cooled on ice for 5 minutes. Then, 10 µl of the solution was added to a fresh 
MTP containing 200 µl dimethylcasein substrate/well. This MTP was covered with tape, 
shaken for a few seconds and placed in an oven at 37°C for 2 hours without agitation. The 
same detection method as used for the DMC hydrolysis assay was used. 

Calculation of Thermostability: 

The residual activity of a sample was expressed as the ratio of the final absorbance 
and the initial absorbance, both corrected for blanks. 

F. LAS Stability Assay 

LAS stability was measured after incubation of the test protease in the presence of 
0.06% LAS (dodecylbenzenesulfonate sodium), and the residual activity was determined 
using the AAPF assay. 




Reagents: 

Dodecylbenzenesulfonate, sodium salt (=LAS): Sigma D-2525 
TWEEN®-80: Sigma P-8074 
TRIS buffer (free acid): Sigma T-1378; 6.35 g is dissolved in about 960 ml water; pH is 
adjusted to 8.2 with 4N HCl. Final concentration of TRIS is 52.5 mM. 
LAS stock solution: Prepare a 10.5% LAS solution in MQ water (=10.5 g per 100 ml MQ) 
TRIS buffer-100 mM / pH 8.6 (100 mM Tris/0.005% Tween80) 
TRIS-Ca buffer, pH 8.6 (100 mM Tris/10 mM CaCl2/0.005% Tween80) 

Hardware: 

Flat bottom MTPs: Costar (#9017) 
Biomek FX 
ASYS Multipipettor 
Spectramax MTP Reader 
iEMS Incubator/Shaker 
Innova 4330 Incubator/Shaker 
Biohit multichannel pipette 
BMG Thermostar Shaker 



Method: 

A 10 µl 0.063% LAS solution was prepared in 52.5 mM Tris buffer pH 8.2. The 
AAPF working solution was prepared by adding 1 ml of 100 mg/ml AAPF stock solution (in 
DMSO) to 100 ml (100 mM) TRIS buffer, pH 8.6. To dilute the supernatants, flat-bottomed 
plates were filled with dilution buffer and an aliquot of the supernatant was added and 
mixed well. The dilution ratio depended on the concentration of the ASP-controls in the 
growth plates (AAPF activity). The desired protein concentration was 80 ppm. 

Ten µl of the diluted supernatant was added to 190 µl 0.063% LAS buffer/well. The 
MTP was covered with tape, shaken for a few seconds and placed in an incubator (Innova 
4230) at 25°C, for 60 minutes at 200 rpm agitation. The initial activity (t=10 minutes) was 
determined after 10 minutes of incubation by transferring 10 µl of the mixture in each well to 
a fresh MTP containing 190 µl AAPF work solution. These solutions were mixed well and the 
AAPF activity was measured using an MTP reader (20 readings in 5 minutes and 25°C). 

The final activity (t=60 minutes) was determined by removing another 10 µl of 
solution from the incubating plate after 60 minutes of incubation. The AAPF activity was 
then determined as described above. The calculations were performed as follows: 
the % Residual Activity was [t=60 value]*100 / [t=10 value]. 
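
The dilution arithmetic and the residual activity formula of this assay can be sketched as below. The volumes and concentrations are those given in the method; the function names and the two example AAPF rates are hypothetical. The calculation also shows why 190 µl of 0.063% LAS buffer plus 10 µl of sample gives approximately the 0.06% LAS stated at the start of this section.

```python
def final_concentration(conc: float, part_ul: float, total_ul: float) -> float:
    """Concentration of a component after its part_ul is made up to total_ul."""
    return conc * part_ul / total_ul


def percent_residual_activity(rate_t60: float, rate_t10: float) -> float:
    """AAPF rate after 60 minutes as a percentage of the rate after 10 minutes."""
    return 100.0 * rate_t60 / rate_t10


# 190 µl of 0.063% LAS buffer plus 10 µl of supernatant diluted to 80 ppm, 200 µl total.
las_in_well = final_concentration(0.063, part_ul=190.0, total_ul=200.0)    # % LAS
protease_in_well = final_concentration(80.0, part_ul=10.0, total_ul=200.0)  # ppm

# Hypothetical AAPF rates (absorbance change per minute) at the two time points.
residual = percent_residual_activity(rate_t60=0.030, rate_t10=0.040)

print(f"LAS in well: {las_in_well:.4f}%")
print(f"Protease in well: {protease_in_well:.1f} ppm")
print(f"Residual activity: {residual:.0f}%")
```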



G. Scrambled Egg Hydrolysis Assay 

Proteases release insoluble particles from scrambled egg, which was baked into the 
wells of 96-well microtiter plates. The scrambled egg coated wells were treated with a 
mixture of protease-containing culture filtrate and ADW (automatic dishwash detergent) to 
determine the enzyme performance in scrambled egg removal. The turbidity of the resulting 
dispersion is a measure of the enzyme activity. 

Materials: 

Water bath 
Oven with mechanical air circulation (Memmert ULE 400) 
Incubator/shaker with amplitude of 0.25 cm (Multitron), equipped with MTP-holders and 
aluminum covers and bottoms 
Biomek FX liquid-handling system (Beckman) 
Micro plate reader (Molecular Devices Spectramax 340, SOFTmax Pro Software) 
Nichiryo 8800 multi channel syringe dispenser + syringes 
Micro titer plate tape 
Single and multi channel pipettes with tips 
Grade A medium eggs 
CaCl2.2H2O (Merck 102382); MgCl2.6H2O (Merck 105833); Na2CO3 (Merck 6392) 
ADW product: LH-powder (= Light House) 



Procedure: 

Three eggs were stirred with a fork in a glass beaker and 100 ml milk (at 4°C or 
room temperature) was added. The beaker was placed in an 85°C water bath, and the 
mixture was stirred constantly with a spoon. As the mixture became thicker, care was taken 
to scrape the solidifying material continuously from the walls and bottom of the beaker. 
When the mixture was slightly runny (after about 25 minutes) the beaker was removed from 
the bath. Another 40 ml milk was added to the mixture and blended with a hand mixer or 
blender for 2 minutes. The mixture was cooled to room temperature (an ice bath can be 
used). The substrate was then stirred with an additional amount of 5 to 15% water (usually 
7.5%). 

Test Method: 

First, 50 µl of scrambled egg substrate were dispensed into each well of an MTP. The 
plates were allowed to dry at room temperature overnight (about 17 hours), baked in an oven at 
80°C for 2 hours, then cooled to room temperature. 

ADW product solution was prepared by dissolving 2.85 g of LH-powder into 1 L 
water. Only about 15 minutes dissolution time was needed and filtration of the solution was 
not needed. Then, 1.16 mL artificial hardness solution was added and 2120 mg Na2CO3 
was dissolved in the solution. 

Hardness solution was prepared by mixing 188.57 g CaCl2.2H2O and 86.92 g 
MgCl2.6H2O in 1 L demi water (equal to 1.28 M Ca + 0.43 M Mg and totally 10000 gpg). The 
above-mentioned amounts of ADW, CaCl2 and MgCl2 were already proportionally increased 
values (200/190x) because of the addition of 10 µl supernatant to 190 µl ADW solution. 
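
The 200/190 correction can be illustrated with a short calculation. This is only a sketch of the dilution arithmetic: the function names are hypothetical, and the 2.85 g/L figure is the ADW amount quoted above, taken here as the already increased value.

```python
WELL_TOTAL_UL = 200.0  # 190 µl ADW solution + 10 µl supernatant
ADW_UL = 190.0


def prepared_from_target(target_in_well: float) -> float:
    """Concentration to prepare so that the well ends up at target_in_well."""
    return target_in_well * WELL_TOTAL_UL / ADW_UL


def in_well_from_prepared(prepared: float) -> float:
    """Concentration in the well after the prepared solution is diluted by the sample."""
    return prepared * ADW_UL / WELL_TOTAL_UL


# 2.85 g/L LH-powder as prepared corresponds to about 2.71 g/L in the well.
adw_in_well = in_well_from_prepared(2.85)
print(f"ADW in well: {adw_in_well:.4f} g/L")

# Going the other way recovers the prepared concentration.
adw_prepared = prepared_from_target(adw_in_well)
print(f"ADW as prepared: {adw_prepared:.2f} g/L")
```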

ADW solution (190 µl) was added to each well of the substrate plate. The MTPs 
were processed by adding 10 µl of supernatant to each well and sealing the plate with tape. 
The plate was placed in a pre-warmed incubator/shaker and secured with a metal cover and 
clamp. The plate was then washed for 30 minutes at the appropriate temperature (50°C for 
US) at 700 rpm. The plate was removed from the incubator/shaker. With gentle up and 
down movements of the liquid, about 125 µl of the warm supernatant were transferred to an 
empty flat bottom plate. After cooling, exactly 100 µl of the dispersion was dispensed into 
the wells of an empty flat bottom plate. The absorbance at 405 nm was determined using a 
microtiter plate reader. 

Calculation of the Scrambled Egg Hydrolyzing Activity: 

The obtained absorbance value was corrected for the blank value (substrate without 
enzyme). The resulting absorbance is a measure of the hydrolytic activity. For each 
sample (variant) the performance index was calculated. The performance index compares 
the performance of the variant (actual value) and the standard enzyme (theoretical value) at 
the same protein concentration. In addition, the theoretical values can be calculated, using 
the parameters of the Langmuir equation of the standard enzyme. A performance index (PI) 
that is greater than 1 (PI>1) identifies a better variant (as compared to the standard [e.g., 
wild-type]), while a PI of 1 (PI=1) identifies a variant that performs the same as the standard, 
and a PI that is less than 1 (PI<1) identifies a variant that performs worse than the standard. 
Thus, the PI identifies winners, as well as variants that are less desirable for use under 
certain circumstances. 



EXAMPLE 2 

Production of 69B4 protease From the Gram-Positive Alkaliphilic Bacterium 69B4 

This Example provides a description of the Cellulomonas strain 69B4 used to initially 
isolate the novel protease 69B4 provided by the present invention. The alkaliphilic 
micro-organism Cellulomonas strain 69B.4 (DSM 16035) was isolated at 37°C on an alkaline 
casein medium containing (g L-1) (See e.g., Duckworth et al., FEMS Microbiol. Ecol., 
19:181-191 [1996]): 



Glucose (Merck 1.08342)       10 
Peptone (Difco 0118)          5 
Yeast extract (Difco 0127)    5 
K2HPO4                        1 
MgSO4.7H2O                    0.2 
NaCl                          40 
Na2CO3                        10 
Casein                        20 
Agar                          20 



An additional alkaline cultivation medium (Grant Alkaliphile Medium) was also used 
to cultivate Cellulomonas strain 69B.4, as provided below: 

Grant Alkaliphile Medium ("GAM") solution A (g L-1) 
Glucose (Merck 1.08342)       10 
Peptone (Difco 0118)          5 
Yeast extract (Difco 0127)    5 
K2HPO4                        1 
MgSO4.7H2O                    0.2 

Dissolved in 800 ml distilled water and sterilized by autoclaving. 

GAM solution B (g L-1) 
NaCl                          40 
Na2CO3                        10 

Dissolved in 200 ml distilled water and sterilized by autoclaving. 

Complete GAM medium was prepared by mixing Solution A (800 ml) with Solution B 
(200 ml). Solid medium was prepared by the addition of agar (2% w/v). 
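
For batches other than 1 liter, the GAM recipe can be scaled as sketched below. The per-liter amounts are those listed above; the function name and the example volumes are hypothetical, and 2% (w/v) agar is taken as 20 g per liter.

```python
# Grams per liter of complete GAM (solutions A and B combined), as listed above.
GAM_G_PER_L = {
    "glucose": 10.0,
    "peptone": 5.0,
    "yeast extract": 5.0,
    "K2HPO4": 1.0,
    "MgSO4.7H2O": 0.2,
    "NaCl": 40.0,
    "Na2CO3": 10.0,
}


def batch_amounts(volume_l: float, solid: bool = False) -> dict:
    """Grams of each component for volume_l of medium; adds agar for plates."""
    amounts = {name: grams * volume_l for name, grams in GAM_G_PER_L.items()}
    if solid:
        amounts["agar"] = 20.0 * volume_l  # 2% (w/v)
    return amounts


# Example: 5 liters of liquid GAM, and 0.5 liter of solid GAM for plates.
fermentor_batch = batch_amounts(5.0)
plate_batch = batch_amounts(0.5, solid=True)
for name, grams in fermentor_batch.items():
    print(f"{name}: {grams:g} g")
```

Solutions A and B would still be prepared and autoclaved separately in the 800:200 volume ratio before mixing.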

Growth Conditions 

From a freshly thawed glycerol vial of culture (stored as a frozen 20% (v/v) glycerol 
stock at -80°C), the micro-organisms were inoculated using an inoculation loop on 
Grant Alkaliphile Medium (GAM) described above in agar plates and grown for at least 2 
days at 37°C. One colony was then used to inoculate a 500 ml shake flask containing 100 
ml of GAM at pH 10. This flask was then incubated at 37°C in a rotary shaker at 280 rpm for 
1-2 days until good growth (according to visual observation) was obtained. Then, 100 ml of 
broth culture was used to inoculate a 7 L fermentor containing 5 liters of GAM. 
The fermentations were run at 37°C for 2-3 days in order to obtain maximal production of 
protease. Fully aerobic conditions were maintained throughout by injecting air, at a rate of 5 
L/min, into the region of the impeller, which was rotating at about 500 rpm. The pH was set 
at pH 10 at the start, but was not controlled during the fermentation. 

Preparation of 69B4 Crude Enzyme Samples 

Culture broth was collected from the fermentor, and cells were removed by 
centrifugation for 30 min at 5000 x g at 10°C. The resulting supernatant was clarified by 
depth filtration over Seitz EKS (SeitzSchenk Filtersystems). The resulting sterile culture 
supernatant was further concentrated approximately 10 times by ultrafiltration using an 
ultrafiltration cassette with a 10 kDa cut-off (Pall Omega 10 kDa Minisette; Pall). The resulting 
concentrated crude 69B4 samples were frozen and stored at -20°C until further use. 

Purification 

The cell-separated culture broth was dialyzed against 20 mM 2-(4-morpholino)- 
ethane sulfonic acid ("MES"), pH 5.4, 1 mM CaCl2 using 8K Molecular Weight Cut Off 
(MWCO) Spectra-Por7 (Spectrum) dialysis tubing. The dialysis was performed overnight or 
until the conductivity of the sample was less than or equal to the conductivity of the MES 
buffer. The dialyzed enzyme sample was purified using a BioCad VISION (Applied 
Biosystems) with a 10x100 mm (7.845 mL) POROS High Density Sulfo-propyl (HS) 20 
(20 micron) cation-exchange column (PerSeptive Biosystems). After loading the enzyme on 
the previously equilibrated column at 5 mL/min, the column was washed at 40 mL/min with a 
pH gradient from 25 mM MES, pH 6.2, 1 mM CaCl2 to 25 mM (N-[2-hydroxyethyl]piperazine- 
N'-[2-ethane] sulfonic acid [C8H18N2O4S, CAS # 7365-45-9]) ("HEPES"), pH 8.0, 1 mM CaCl2 
in 25 column volumes. Fractions (8 mL) were collected across the run. The pH 8.0 wash 
step was held for 5 column volumes and then the enzyme was eluted using a gradient (0- 
100 mM NaCl in the same buffer in 35 column volumes). Protease activity in the fractions 
was monitored using the pNA assay (sAAPF-pNA assay; DelMar et al., supra). Protease 
activity which eluted at 40 mM NaCl was concentrated and buffer exchanged (using a 5K 
MWCO VIVA Science 20 mL concentrator) into 20 mM MES, pH 5.8, 1 mM CaCl2. This 
material was used for further characterization of the enzyme. 

EXAMPLE 3 

PCR Amplification of a Serine Protease Gene Fragment 

In this Example, PCR amplification of a serine protease gene fragment is described. 

Degenerate Primer Design 

Based on alignments of published serine protease amino acid sequences, a range of 
degenerate primers were designed against conserved structural and catalytic regions. Such 
regions included those that were highly conserved among the serine proteases, as well as 
those known to be important for enzyme structure and function. 

During the development of the present invention, protein sequences of nine 
published serine proteases (Streptogrisin C homologues) were aligned. 
The sequences were Streptomyces griseus Streptogrisin C (accession no. P52320); alkaline 
serine protease precursor from Thermobifida fusca (accession no. AAC23545); alkaline 
proteinase (EC 3.4.21.-) from Streptomyces sp. (accession no. PC2053); alkaline serine 
proteinase I from Streptomyces sp. (accession no. S34672); serine protease from 
Streptomyces lividans (accession no. CAD42808); putative serine protease from 
Streptomyces coelicolor A3(2) (accession no. NP_625129); putative serine protease from 
Streptomyces avermitilis MA-4680 (accession no. NP_822175); serine protease from 
Streptomyces lividans (accession no. CAD42809); putative serine protease precursor from 
Streptomyces coelicolor A3(2) (accession no. NP_628830). All of these sequences are 
publicly available from GenBank. These alignments are provided below. In this alignment, 
two conserved boxes are underlined and shown in bold. 

                  1                                                50 
AAC23545   (1)    - -- MNHSSR -- RTTSLLFTAALAATALVAATTPAS 
PC2053     (1)    - -MRHTGR-NAIGAAIAASALAFALVPSQAAAN DTLTERAEAAV 
S34672     (1)    -- MRLKGRTVAI G S ALAAS ALALSLVPANAS S ELP SAETAKADALV 
CAD42808   (1)    MVGRHAAR- SRRAALTALGALVLTALP S AASAAPP PVPGPRPAVARTPDA 
NP_625129  (1)    MVGRHAAR- SRRAALTALGALVLTALP S AASAAPP PVPGPRPAVARTPDA 
NP_822175  (1)    MVHRHVG--AGCAGLSVLATLVLTGLPAAAAIEPP-GPAPAPSAVQPLGA 
CAD42809   (1)    MPHRHRHH - RAVGAAVAATAALLVAGL SGS AS AGTAPAG SAPTAAETLRT 
NP_628830  (1)    MPHRHRHH - RAVGAAVAATAALLVAGLSGSAS AGTAPAG SAPTAAETLRT 
P52320     (1)    MERTT - LRRRALVAGTATVAVGALALAGLTGVAS ADPAATAAPPVS A 

                  51                                               100 
AAC23545   (31)   AQELALKRDLGLSDAEVAELRAAEAEAVELEEELRDSLGSDFGGV 
PC2053     (42)   ADLPAGVLDAMERDLGLSEQEAGLKLVAEHDAALLGETLSADLDAFAGSW 
S34672     (45)   EQLPAGMVDAMERDLGVPAAEVGNQLVAEHEAAVLEESL SEDLSGYAG SW 
CAD42808   (50)   ATAPARMLSAMERDLRLAPGQAAARPVNEAEAGTRAGMLRNTLGDRFAGA 
NP_625129  (50)   ATAPARMLSAMERDLRLAPGQAAARLVNEAEAGTRAGMLRNTLGDRFAGA 
NP_822175  (48)   GNPSTAVLGALQRDLHLTDTQAKTRLVNEMEAGTRAGRLQNALGKHFAGA 
CAD42809   (50)   DAAPPALLKAMQRDLGI DRRQAERRLVNEAEAGATAGRLRAALGGDFAGA 
NP_628830  (50)   DAAPPALLKAMQRDLGLDRRQAERRLVNEAEAGATAGRLRAALGGDFAGA 
P52320     (47)   DSLSPGMLAALERDLGLDEDAARSRIANEYRAAAVAAGLEKSLGARYAGA 

                  101                                              150 
AAC23545   (76)   YLDAIH 1 - TEI TVAWDPAAVSRVDADDVTVDVVDFGETALNDFVASLNAI 
PC2053     (92)   LAEGT ELWATTSEAEAAE I TEAGATAEWDHTLAELDS VKDALDTA 
S34672     (95)   IVEGTS -- EHWATTDRAEAAE I T AAG AT ATWEH S LAELEAVKD I LDEA 
CAD42808   (100)  WVS G AT S AELTVATTDAADT AA I EAQG AKAAVVGRNLAE LRAVKEKLD AA 
NP_625129  (100)  WVSGATSAELTVATTDAADTAAI EAQG AKAA WGRNLAE LRAVKEKLD AA 
NP_822175  (98)   WVHGAASADLTVATTHATDI PAITAGGATAWVKTGLDDLKGAKKKLDSA 
CAD42809   (100)  WVRGAE S GTLTVATTD AGDVAAVEARGAE AKVVRH SLADLDAAKARLDT A 
NP_628830  (100)  WVRGAE SGTLTVATTDAGDVAAI E ARGAE AKWRH S LADLDAAKARLDTA 
P52320     (97)   RVSGAK-ATLTVATTDASEAARITEAGARAEWGHSLDRFEGVKKSLDKA 

                  151                                              200 
AAC23545   (125)  ADT -- ADPKVTGWYTDLESDAWI TTLRGGTPAAEELAERAGLDERAVRI 
PC2053     (139)  AE S - YDTTDAPVWYVDVTTNGVVLLTSD -- VTEAEGFVEAAGVNAAAVDI 
S34672     (143)  ATA-NPEDAAPVWYVDVTTNEVWLASD -- VPAAEAFVAASGADASTVRV 
CAD42808   (150)  AVR-TRTRQTPVWYVDVKTNRVTVQATG -- A S AAAAF VE AAGVPAADVGV 
NP_625129  (150)  AVR -TRTRQT PVWYVD VKTNRVTVQ ATG -- ASAAAAFVEAAGVPAADVGV 
NP_822175  (148)  VAHGGTAVNT P VR YVDVRTNRVTLQ ARS -- RAAADAL I AAAGVDSGLVDV 
CAD42809   (150)  AAG-LNTADAPVWYVDTRTNTWVEAIR -- PAAARSLLTAAGVDGSLAHV 
NP_628830  (150)  AAG - LNTAD AP VWYVDTRTNTVVVE AI R -- PAAARSLLTAAGVDGSLAHV 
P52320     (146)  ALD -KAPKNVPVWYVDVAANRVWNAAS -- PAAGQAFLKVAGVDRGLVTV 

                  201                                              250 
AAC23545   (173)  VEEDE EPQSLAAI IGGNPYYFGN- YRCSIGFSVRQGSQTGFATAGHCGST 
PC2053     (186)  QTSDEQPQAFYDLVGGDAYYMGG -GRCS VGF S VTQGSTPGFATAGHCGTV 
S34672     (190)  ERSDESPQPFYDLVGGDAYYIGN-GRCSIGFSVRQGSTPGFVTAGHCGSV 
CAD42808   (197)  RVSPDQPRVLEDLVGGDAYYI DDQ ARCS I GFSVTKDDQEGFATAGHCGDP 
NP_625129  (197)  RVS PDQPRVLEDLVGGDAYYI DDQARC S IGFS VTKDDQEGFATAGHCGDP 
NP_822175  (196)  KVSEDRPRALFDIRGGDAYYIDNTARCSVGFSVTKGNQQGFATAGHCGRA 
CAD42809   (197)  KNRTERPRTF YDLRGGE A Y YINNS SRCS I GFPITKGTQQGFATAGHCDRA 
NP_628830  (197)  KNRTERPRTFYDLRGGEAYYINNS SRC S IGFPITKGTQQGFATAGHCGRA 
P52320     (193)  ARSAEQPRALADIRGGDAYYMNGSGRCSVGFSVTRGTQNGFATAGHCGRV 

                  251                    *                         300 
AAC23545   (222)  GTRVS S P SGTVAG S YF PGRDMGWVRITSADTOTPL VNRYNGGTVTV 
PC2053     (235)  GTSTTG YNQAAQGTFEE S SFPGDDMAWVSVNSDWNTTPTVNE -- GE-VTV 
S34672     (239)  GNATTG FNRVS QGTFRG SWF PGRDMAWVAVN SNWT P TS LVRNS -GSGVRV 
CAD42808   (247)  GATTTGYNEADQGTFQASTFPGKDMAWVGVNSDWTATPDVKAEGGEKIQL 
NP_625129  (247)  GATTTGYNEADQGTFQASTF PGKDMAWVGVNSDWTATPDVKAEGGEKI QL 
NP_822175  (246)  GAPTAGFNE VAQGTVQ A SVF PGHDMAWVGVN SDWTATPD VAG AAG QNVS I 
CAD42809   (247)  GS STTGANRVAQGTFQG S I FPGRDMAWVATNS SWTATP YVLGAGGQNVQV 
NP_628830  (247)  GS STTGANRVAQGTFQGS I FPGRDMAWVATNS SWTATP YVLGAGGQNVQV 
P52320     (243)  GTTTNGVNQQAQGTFQGSTFPGRDIAWVATNANWTPRPLVNGYGRGDVTV 

                  301                                              350 
AAC23545   (268)  TG SQEAATGS S VCRSGATTGWRCGTI QSKNQTVRYAEGTVTGLTRTTACA 
PC2053     (282)  SG STEAAVGAS I CRSGSTTGWHCGTI QQHNTS VTYPEGTITGVTRTSVCA 
S34672     (288)  TGSTQATVGSSICRSGSTTGWRCGTIQQHNTSVTYPQGTITGVTRTSACA 
CAD42808   (297)  AGSVEALVGASVCRSGSTTGWHCGTIQQHDTSVTYPEGTVDGLTGTTVCA 
NP_625129  (297)  AG S VEALVGASVCRSG STTGWHCGTI QQHDTSVTYPEGTVDGLTETTVCA 
NP_822175  (296)  AGS VQAI VGAAI CRSG STTGWHCGTVEEHDTS VTYEEGTVDGLTRTTVCA 
CAD42809   (297)  TGSTAS PVGASVCRSG STTGWHCGTVTQLNTSVTYQEGT I S PVTRTTVCA 
NP_628830  (297)  TG STAS PVGAS VCRSGSTTGWHCGTVTQLNTS VT YQEGTI S PVTRTTVCA 
P52320     (293)  AGSTAS WGAS VCRSG STTGWHCGT I QQLNTSVTYPEGTI SGVTRTSVCA 

                  351                                              400 
AAC23545   (318)  EGGDSGGPWLTGSQAQGVTSGGTGDCRSGGITFFQPINPLLSYFGLQLVT 
PC2053     (332)  EPGDSGGSYISGSQAQGVTSGGSGNCTSGGTTYHQPINPLLSAYGLDLVT 
S34672     (338)  QPGDSGGSFISGTQAQGOTSGGSGNCSIGGTTFHQPVNPILSQYGLTLVR 
CAD42808   (347)  EPGDSGGPFVSGVQAQGTTSGGSGDCTNGGTTFYQPVNPLLSDFGLTLKT 
NP_625129  (347)  EPGDSGGPFVSGVQAQGTTSGGSGDCmGGTTFYQPVNPLLSDFGLTLKT 
NP_822175  (346)  EPGDSGGS FVSGSQAQG VTSGG SGDCTRGGTTYYQPVNPI LSTYGLTLKT 
CAD42809   (347)  EPGDSGGSFI SG SQAQGVTSGG SGDCRTGGGTFFQ P INALLQNYGLTLKT 
NP_628830  (347)  EPGDSGGSF I SG SQAQGVTSGG SGDCRTGGETFFQP INALLQNYGLTLKT 
P52320     (343)  E PGDSGG SYISGSQAQG VTSGG SGNCSSGGTTYFOPINPLLOAYGLTLVT 

                  401                                              450 
AAC23545   (368)  G 
PC2053     (382)  G 
S34672     (388)  S 
CAD42808   (397)  T S AATQTPAPQDNAAA DAWTAGRVYEVGTTVS YDGVRYRCLQS H 
NP_625129  (397)  TSAATQTPAPQDNAAA DAWTAGRVYE VGTTV S YDG VRYRCLQ S H 
NP_822175  (396)  STAPTDTPSDPVDQSG VWAAGR VYE VG AQ VT YAGVT Y Q CLQ S H 
CAD42809   (397)  TGGDDGGGDDGG EEPGG-TWAAGTVYQPGDTVTYGGATFRCLQGH 
NP_628830  (397)  TGGDDGGGDDGGGDDGGE EPGG -TWAAGTVYQPGDTVTYGGATFRCLQGH 
P52320     (393)  SGGGTPTDPPTTPPTDSP GGTWAVGTAYAAGATVTYGGATYRCLQAH 

                  451            468 
AAC23545   (369)                      (SEQ ID NO:648) 
PC2053     (383)                      (SEQ ID NO:649) 
S34672     (389)                      (SEQ ID NO:650) 
CAD42808   (441)  QAQGVGS PAS VPALWQRV (SEQ ID NO:651) 
NP_625129  (441)  QAQGVGSPASVPALWQRV  (SEQ ID NO:652) 
NP_822175  (439)  QAQ GVWQ P AAT P ALWQRL (SEQ ID NO:653) 
CAD42809   (441)  QAYAGWE PPNVPALWQRV (SEQ ID NO:654) 
NP_628830  (446)  QAYAGWEPPNVPALWQRV  (SEQ ID NO:655) 
P52320     (440)  TAQPGWTPADVPALWQRV  (SEQ ID NO:656) 



Two particular regions were chosen to meet the criteria above, and a forward and a 
reverse primer were designed based on these amino acid regions. The specific amino acid 
regions used to design the primers are highlighted in black in the sequences shown in the 
alignments directly above. Using the genetic code for codon usage, degenerate nucleotide 
PCR primers were synthesized by MWG-Biotech. The degenerate primer sequences 
produced were: 

forward primer TTGWXCGT_FW: 5'-ACNACSGGSTGGCRGTGCGGCAC-3' (SEQ ID NO:10) 

reverse primer GDSGGX_RV: 5'-ANGNGCCGCCGGAGTCNCC-3' (SEQ ID NO:11) 

All primers were synthesized in the 5'-3' direction, and the standard IUB code for 
mixed base sites was used (e.g., "N" designates A/C/T/G). Degenerate primers 
TTGWXCGT_FW and GDSGGX_RV successfully amplified a 177 bp region from 
Cellulomonas sp. isolate 69B4 by PCR, as described below. 
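
To make the IUB notation concrete, the sketch below counts and enumerates the individual oligonucleotides that each degenerate primer represents. The two primer sequences are those given above; the function names are hypothetical and the code is an illustration only, not part of the original primer design procedure.

```python
from itertools import product

# Standard IUB/IUPAC nucleotide codes and the bases each one stands for.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUB[base])
    return count


def expand(primer: str) -> list:
    """All concrete sequences encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUB[b] for b in primer))]


FORWARD = "ACNACSGGSTGGCRGTGCGGCAC"  # TTGWXCGT_FW
REVERSE = "ANGNGCCGCCGGAGTCNCC"      # GDSGGX_RV

print(degeneracy(FORWARD))   # 32: one N, two S and one R position (4 * 2 * 2 * 2)
print(degeneracy(REVERSE))   # 64: three N positions (4 * 4 * 4)
print(expand(FORWARD)[0])    # first concrete forward primer in the pool
```

Each synthesized "primer" is therefore a pool of 32 or 64 oligonucleotides, only some of which match the target genome exactly.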

PCR Amplification of a Serine Protease Gene Fragment 

Cellulomonas sp. isolate 69B4 genomic DNA was used as a template for PCR 
amplification of putative serine protease gene fragments using the above-described primers. 
PCR was carried out using High Fidelity Platinum Taq polymerase (Catalog number 11304- 
102; Invitrogen). Conditions were determined by individual experiments, but typically thirty 
cycles were run in a thermal cycler (MJ Research). Successful amplification was verified by 
electrophoresis of the PCR reaction on a 1% agarose TBE gel. A PCR product that was 
amplified from Cellulomonas sp. 69B4 with the primers TTGWXCGT_FW and 
GDSGGX_RV was purified by gel extraction using the Qiaquick Spin Gel Extraction kit 
(Catalogue 28704; Qiagen) according to the manufacturer's instructions. The purified PCR 
product was cloned into the commercially available pCR2.1TOPO vector System 
(Invitrogen) according to the manufacturer's instructions, and transformed into competent 
E. coli TOP10 cells. Colonies containing recombinant plasmids were visualized using 
blue/white selection. For rapid screening of recombinant transformants, plasmid DNA was 
prepared from cultures of putative positive (i.e., white) colonies. DNA was isolated using 
the Qiagen plasmid purification kit, and was sequenced by Baseclear. One of the clones 
contained a DNA insert of 177 bp that showed some homology with several streptogrisin-like 
protease genes of various Streptomyces species and also with serine protease genes from 
other bacterial species. The DNA and protein coding sequence of this 177 bp fragment is 
provided in Fig. 13. 

Sequence Analysis 

The sequences were analyzed by BLAST and other protein translation sequence 
tools. BLAST comparison at the nucleotide level showed various levels of identity to 
published serine protease sequences. Initially, nucleotide sequences were submitted to 
BLAST (Basic BLAST version 2.0). The program chosen was "BlastX", and the database 
chosen was "nr." Standard/default parameter values were employed. Sequence data for 
putative Cellulomonas 69B4 protease gene fragment was entered in FASTA format and the 
query submitted to BLAST to compare the sequences of the present invention to those 
already in the database. The results for the 177 bp fragment returned a high number of hits 
for protease genes from various Streptomyces spp., including S. griseus, S. lividans, S. 
coelicolor, S. albogriseolus, S. platensis, S. fradiae, and Streptomyces sp. It was concluded 
that further analysis of the 177 bp fragment cloned from Cellulomonas sp. isolate 69B4 was 
desired. 



EXAMPLE 4 

Isolation of a Polynucleotide Sequence from the Genome 
of Cellulomonas 69B4 Encoding a Serine Protease by Inverse PCR 

In this Example, experiments conducted to isolate a polynucleotide sequence 
encoding a serine protease produced by Cellulomonas sp. 69B4 are described. 

Inverse PCR of Cellulomonas sp. 69B4 Genomic DNA to Isolate the Gene Encoding 
Cellulomonas strain 69B4 Protease 

Inverse PCR was used to isolate and clone the full-length serine protease gene from 
Cellulomonas sp. 69B4. Based on the DNA sequence of the 177 bp fragment of the 
Cellulomonas protease gene described in Example 3, novel DNA primers were designed: 

69B4int_RV1 5'-CGGGGTAGGTGACCGAGGAGTTGAGCGCAGTG-3' (SEQ ID NO:14) 
69B4int_FW2 5'-GCTCGCCGGCAACCAGGCCCAGGGCGTCACGTC-3' (SEQ ID NO:15) 

Chromosomal DNA of Cellulomonas sp. 69B4 was digested with the restriction 
enzymes ApaI, BamHI, BssHII, KpnI, NarI, NcoI, NheI, PvuI, SalI or SstII, purified using the 
Qiagen PCR purification kit (Qiagen, Catalogue # 28106) and self-ligated with T4 DNA 
ligase (Invitrogen) according to the manufacturers' instructions. Ligation mixtures were 
purified using the Qiagen PCR purification kit, and PCR was performed with primers 
69B4int_RV1 and 69B4int_FW2. PCR on DNA fragments that were digested with NcoI, and 
then self-ligated, resulted in a PCR product of approximately 1.3 kb. DNA sequence 
analysis (BaseClear) revealed that this DNA fragment covers the main part of a 
streptogrisin-like protease gene from Cellulomonas. This protease was designated as 
"69B4 protease," and the gene encoding Cellulomonas 69B4 protease was designated as 
the "asp gene." The entire sequence of the asp gene was derived by additional inverse 
PCR reactions with primer 69B4int_FW2 and another primer: 69B4-for4 (5'-AAC GGC 
GGG TTC ATC ACC GCC GGC CAC TGC GGC C-3' [SEQ ID NO:16]). Inverse PCR with 
these primers on NcoI, BssHII, ApaI and PvuI digested and self-ligated DNA fragments of 
genomic DNA of Cellulomonas sp. 69B4 resulted in the identification of the entire sequence 
of the asp gene. 
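
Inverse PCR requires the two primers to point away from each other on the known fragment, so that amplification proceeds around the self-ligated circle into the unknown flanking DNA. The sketch below checks this for the two internal primers against positions 1081-1260 of the asp gene sequence listed later in this Example. The primer and gene sequences are taken from the text; the function and variable names are hypothetical, and the check is an illustration rather than part of the original procedure.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Top strand of the asp gene, positions 1081-1260 (three 60-base rows of the listing).
REGION_START = 1081
REGION = "".join((
    "CGGGTGGCAC TGCGGCACCA TCACTGCGCT CAACTCCTCG GTCACCTACC CCGAGGGCAC "
    "CGTCCGCGGC CTGATCCGCA CCACCGTCTG CGCCGAGCCC GGCGACTCCG GTGGCTCGCT "
    "GCTCGCCGGC AACCAGGCCC AGGGCGTCAC GTCCGGCGGC TCCGGCAACT GCCGCACCGG"
).split())

RV1 = "CGGGGTAGGTGACCGAGGAGTTGAGCGCAGTG"   # 69B4int_RV1, anneals to the top strand
FW2 = "GCTCGCCGGCAACCAGGCCCAGGGCGTCACGTC"  # 69B4int_FW2, identical to the top strand

rv1_site = REGION.find(reverse_complement(RV1))  # 0-based offset of the RV1 footprint
fw2_site = REGION.find(FW2)                      # 0-based offset of the FW2 footprint

print("RV1 footprint starts at", REGION_START + rv1_site, "and extends upstream")
print("FW2 footprint starts at", REGION_START + fw2_site, "and extends downstream")

# RV1 lies upstream of FW2 and reads leftwards; FW2 reads rightwards: the primers diverge.
outward_facing = rv1_site != -1 and fw2_site != -1 and rv1_site + len(RV1) <= fw2_site
print("Primers face outward:", outward_facing)
```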

Nucleotide and Amino Acid Sequences 

For convenience, various sequences are included below. First, the DNA sequence 
of the asp gene (SEQ ID NO:1) provided below encodes the signal peptide (SEQ ID NO:9) 
and the precursor serine protease (SEQ ID NO:7) derived from Cellulomonas strain 69B4 
(DSM 16035). The initiating polynucleotide encoding the signal peptide of the Cellulomonas 
strain 69B4 protease is in bold (ATG). 



1 


GCGCGCTGCG 


CCCACGACGA 


CGCCGTCCGC 


CGTTCGCCGG 


CGTACCTGCG 


TTGGCTCACC 




CGCGCGACGC 


GGGTGCTGCT 


GCGGCAGGCG 


GCAAGCGGCC 


GCATGGACGC 


AACCGAGTGG 


61 


ACCCACCAGA 


TCGACCTCCA 


TAACGAGGCC 


GTATGACCAG 


AAAGGGATCT 


GCCACCGCCC 




TGGGTGGTCT 


AGCTGGAGGT 


ATTGCTCCGG 


CATACTGGTC 


TTTCCCTAGA 


CGGTGGCGGG 


121 


ACCAGCACGC 


TCCTAACCTC 


CGAGCACCGG 


CGACCGCCGG 


GTGCGATGAA 


AGGGACGAAC 




TGGTCGTGCG 


AGGATTGGAG 


GCTCGTGGCC 


GCTGGCGGCC 


CACGCTACTT 


TCCCTGCTTG 


181 


CGAGATGACA 


CCACGCACAG 


TCACGCGGGC 


CCTGGCCGTG 


GCCACCGCAG 


CCGCCACACT 




GCTCTACTGT 


GGTGCGTGTC 


AGTGCGCCCG 


GGACCGGCAC 


CGGTGGCGTC 


GGCGGTGTGA 


241 


CCTGGCAGGC 


GGCATGGCCG 


CCCAGGCCAA 


CGAGCCCGCA 


CCACCCGGGA 


GCGCGAGCGC 




GGACCGTCCG 


CCGTACCGGC 


GGGTCCGGTT 


GCTCGGGCGT 


GGTGGGCCCT 


CGCGCTCGCG 


301 


ACCGCCACGC 


CTGGCCGAGA 


AGCTCGACCC 


CGACCTCCTC 


GAGGCCATGG 


AGCGCGACCT 




TGGCGGTGCG 


GACCGGCTCT 


TCGAGCTGGG 


GCTGGAGGAG 


CTCCGGTACC 


TCGCGCTGGA 


361 


GGGCCTCGAC 


GCGGAGGAAG 


CCGCCGCCAC 


CCTGGCGTTC 


CAGCACGACG 


CAGCCGAGAC 




CCCGGAGCTG 


CGCCTCCTTC 


GGCGGCGGTG 


GGACCGCAAG 


GTCGTGCTGC 


GTCGGCTCTG 
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421 


CGGCGAGGCC 


CTCGCCGAAG 


AGCTCGACGA GGACTTCGCC 


GGCACCTGGG 


TCGAGGACGA 






GCCGCTCCGG 


GAGCGGCTTC 


TCGAGCTGCT 


CCTGAAGCGG 


CCGTGGACCC 


AGCTCCTGCT 




481 


CGTCCTGTAC 


GTCGCCACCA 


CCGACGAGGA 


CGCCGTCGAG 


GAGGTCGAGG 


GCGAAGGCGC 






GCAGGACATG 


CAGCGGTGGT 


GGCTGCTCCT 


GCGGCAGCTC 


CTCCAGCTCC 


CGCTTCCGCG 


5 


541 


CACGGCCGTC 


ACCGTCGAGC 


ACTCCCTGGC 


CGACCTCGAG 


GCCTGGAAGA 


CCGTCCTCGA 






GTGCCGGCAG 


TGGCAGCTCG 


TGAGGGACCG 


GCTGGAGCTC 


CGGACCTTCT 


GGCAGGAGCT 




601 


CGCCGCCCTC 


GAGGGCCACG 


ACGACGTGCC 


CACCTGGTAC 


GTCGACGTCC 


CGACCAACAG 






GCGGCGGGAG 


CTCCCGGTGC 


TGCTGCACGG 


GTGGACCATG 


CAGCTGCAGG 


GCTGGTTCTC 




661 


CGTCGTCGTC 


GCCGTCAAGG 


CCGGAGCCCA GGACGTCGCC 


GCCGGCCTCG 


TCGAAGGTGC 


10 




GCAGCAGCAG 


CGGCAGTTCC 


GGCCTCGGGT 


CCTGCAGCGG 


CGGCCGGAGC 


AGCTTCCACG 




721 


CGACGTCCCG 


TCCGACGCCG 


TGACCTTCGT 


CGAGACCGAC 


GAGACCCCGC 


GGACCATGTT 






GCTGCAGGGC 


AGGCTGCGGC 


ACTGGAAGCA 


GCTCTGGCTG 


CTCTGGGGCG 


CCTGGTACAA 




781 


CGACGTGATC 


GGCGGCAACG 


CCTACACCAT 


CGGGGGGCGC 


AGCCGCTGCT 


CGATCGGGTT 






GCTGCACTAG 


CCGCCGTTGC 


GGATGTGGTA GCCCCCCGCG 


TCGGCGACGA 


GCTAGCCCAA 


15 


841 


CGCGGTCAAC 


GGCGGGTTCA 


TCACCGCCGG 


CCACTGCGGC 


CGCACCGGCG 


CCACCACCGC 






GCGCCAGTTG 


CCGCCCAAGT 


AGTGGCGGCC 


GGTGACGCCG 


GCGTGGCCGC 


GGTGGTGGCG 




901 


CAACCCCACC 


GGGACCTTCG 


CCGGGTCCAG 


CTTCCCGGGC 


AACGACTACG 


CGTTCGTCCG 






GTTGGGGTGG 


CCCTGGAAGC 


GGCCCAGGTC 


GAAGGGCCCG 


TTGCTGATGC 


GCAAGCAGGC 




961 


TACCGGGGCC 


GGCGTGAACC 


TGCTGG CCCA GGTCAACAAC 


TACTCCGGTG 


GCCGCGTCCA 






ATGGCCCCGG 


CCGCACTTGG 


ACGACCGGGT 


CCAGTTGTTG 


ATGAGGCCAC 


CGGCGCAGGT 




1021 


GGTCGCCGGG 


CACACCGCGG 


CCCCCGTCGG 


CTCGGCCGTG 


TGCCGGTCCG 


GGTCGACCAC 






CCAGCGGCCC 


GTGTGGCGCC 


GGGGGCAGCC 


GAGCCGGCAC 


ACGGCCAGGC 


CCAGCTGGTG 




1081 


CGGGTGGCAC 


TGCGGCACCA 


TCACTGCGCT 


CAACTCCTCG 


GTCACCTACC 


CCGAGGGCAC 






GCCCACCGTG 


ACGCCGTGGT 


AGTGACGCGA 


GTTGAGGAGC 


CAGTGGATGG 


GGCTCCCGTG 




1141 


CGTCCGCGGC 


CTGATCCGCA 


CCACCGTCTG 


CGCCGAGCCC 


GGCGACTCCG 


GTGGCTCGCT 






GCAGGCGCCG 


GACTAGGCGT 


GGTGGCAGAC 


GCGGCTCGGG 


CCGCTGAGGC 


CACCGAGCGA 




1201 


GCTCGCCGGC 


AACCAGGCCC 


AGGGCGTCAC 


GTCCGGCGGC 


TCCGGCAACT 


GCCGCACCGG 






CGAGCGGCCG 


TTGGTCCGGG 


TCCCGCAGTG 


CAGGCCGCCG 


AGGCCGTTGA 


CGGCGTGGCC 




1261 


TGGCACCACG 


TTCTTCCAGC 


CGGTCAACCC 


CATCCTCCAG 


GCGTACGGCC 


TGAGGATGAT 






ACCGTGGTGC 


AAGAAGGTCG 


GCCAGTTGGG 


GTAGGAGGTC 


CGCATGCCGG 


ACTCCTACTA 




1321 


CACCACGGAC 


TCGGGCAGCA 


GCCCGGCCCC 


TGCACCGACC 


TCCTGCACCG 


GCTACGCCCG 






GTGGTGCCTG 


AGCCCGTCGT 


CGGGCCGGGG 


ACGTGGCTGG 


AGGACGTGGC 


CGATGCGGGC 




1381 


CACCTTCACC 


GGGACCCTCG 


CGGCCGGCCG 


GGCCGCCGCC 


CAGCCCAACG 


GGTCCTACGT 






GTGGAAGTGG 


CCCTGGGAGC 


GCCGGCCGGC 


CCGGCGGCGG 


GTCGGGTTGC 


CCAGGATGCA 




1441 


GCAGGTCAAC 


CGGTCCGGGA 


CCCACAGCGT 


GTGCCTCAAC 


GGGCCCTCCG 


GTGCGGACTT 






CGTCCAGTTG 


GCCAGGCCCT 


GGGTGTCGCA 


CACGGAGTTG 


CCCGGGAGGC 


CACGCCTGAA 




1501 


CGACCTCTAC 


GTGCAGCGCT 


GGAACGGCAG 


CTCCTGGGTG 


ACCGTCGCCC 


AGAGCACCTC 






GCTGGAGATG 


CACGTCGCGA 


CCTTGCCGTC 


GAGGACCCAC 


TGGCAGCGGG 


TCTCGTGGAG 




1561 


CCCCGGCTCC AACGAGACCA 


TCACCTACCG 


CGGCAACGCC 


GGCTACTACC 


GCTACGTGGT 






GGGGCCGAGG 


TTGCTCTGGT 


AGTGGATGGC 


GCCGTTGCGG 


CCGATGATCG 


CGATGCACCA 




1621 


CAACGCCGCG 


TCCGGCTCCG 


GTGCCTACAC 


CATGGGGCTC 


ACCCTCCCCT 


GACGTAGCGC 






GTTGCGGCGC 


AGGCCGAGGC 


CACGGATGTG 


GTACCCCGAG 


TGGGAGGGGA 


CTGCATCGCG (SEQ ID NO:1)



The following DNA sequence (SEQ ID NO:2) encodes the signal peptide (SEQ ID
NO:9) that is operatively linked to the precursor protease (SEQ ID NO:7) derived from
Cellulomonas strain 69B4 (DSM 16035). The initiating polynucleotide encoding the signal
peptide of the Cellulomonas strain 69B4 protease is in bold (ATG). The asterisk indicates
the termination codon (TGA), beginning with residue 1486. Residues 85, 595, and 1162,
which relate to the initial residues of the N-terminal prosequence, mature sequence and
carboxyl-terminal prosequence, respectively, are bolded and underlined.
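The nucleotide landmarks quoted above can be related to amino acid residue numbers by simple codon arithmetic. The following is a minimal sketch, assuming the coding sequence is read in frame from nucleotide 1; the function names and dictionary labels are illustrative and do not come from the specification.

```python
# Map 1-based nucleotide positions in a coding sequence to 1-based codon
# (amino acid) numbers. Assumes the reading frame starts at nucleotide 1.

def codon_number(nt_position: int) -> int:
    """Return the 1-based codon that contains a 1-based nucleotide position."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


def starts_codon(nt_position: int) -> bool:
    """True when the position is the first base of a codon in frame 1."""
    return (nt_position - 1) % 3 == 0


# Landmark labels are hypothetical names chosen for this illustration;
# the positions are those quoted in the text for SEQ ID NO:2.
landmarks = {
    "N-terminal prosequence": 85,
    "mature sequence": 595,
    "C-terminal prosequence": 1162,
    "termination codon": 1486,
}

codons = {name: codon_number(pos) for name, pos in landmarks.items()}

for name, pos in landmarks.items():
    print(f"{name}: nucleotide {pos} -> codon {codons[name]}, "
          f"first base of codon: {starts_codon(pos)}")
```

Under this assumption the three landmarks fall on codons 29, 199 and 388, and the termination codon is codon 496, which is consistent with a 495 amino acid translation product.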
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10 






61 
121 
181 
241 
301 
361 
421 
481 

541 
601 
661 
721 
781 
841 
901
961
1021 
1081 

1141 
1201 
1261 
1321 
1381 

1441 



ATGACACCAC 
TACTGTGGTG 

GCAGGCGGCA 
CGTCCGCCGT 
CCACGCCTGG 
GGTGCGGACC 
CTCGACGCGG 
GAGCTGCGCC 
GAGGCCCTCG 
CTCCGGGAGC 
CTGTACGTCG 
GACATGCAGC 
GCCGTCACCG 
CGGCAGTGGC 
GCCCTCGAGG 
CGGGAGCTCC 
GTCGTCGCCG 
CAGCAGCGGC 

GTCCCGTCCG 
CAGGGCAGGC 
GTGATCGGCG 
CACTAGCCGC 
GTCAACGGCG 
CAGTTGCCGC
CCCACCGGGA 
GGGTGGCCCT 
GGGGCCGGCG 
CCCCGGCCGC 
GCCGGGCACA 
CGGCCCGTGT 
TGGCACTGCG 
ACCGTGACGC 
CGCGGCCTGA 
GCGCCGGACT 
GCCGGCAACC 
CGGCCGTTGG 
ACCACGTTCT 
TGGTGCAAGA 

ACGGACTCGG 
TGCCTGAGCC 
TTCACCGGGA 
AAGTGGCCCT 
GTCAACCGGT 
CAGTTGGCCA 
CTCTACGTGC 
GAGATGCACG 
GGCTCCAACG 
CCGAGGTTGC 

GCCGCGTCCG 
CGGCGCAGGC 



GCACAGTCAC 
CGTGTCAGTG 

TGGCCGCCCA 
ACCGGCGGGT 
CCGAGAAGCT 
GGCTCTTCGA 
AGGAAGCCGC 
TCCTTCGGCG 
CCGAAGAGCT 
GGCTTCTCGA 
CCACCACCGA 
GGTGGTGGCT 
TCGAGCACTC 
AGCTCGTGAG 
GCCACGACGA
CGGTGCTGCT 
TCAAGGCCGG 
AGTTCCGGCC 



GCGGGCCCTG 
CGCCCGGGAC 

85
GGCCAACGAG 
CCGGTTGCTC 
CGACCCCGAC 
GCTGGGGCTG 
CGCCACCCTG 
GCGGTGGGAC 
CGACGAGGAC 
GCTGCTCCTG 
CGAGGACGCC 
GCTCCTGCGG 
CCTGGCCGAC 
GGACCGGCTG 
CGTGCCCACC 
GCACGGGTGG 
AGCCCAGGAC 
TCGGGTCCTG 



GCCGTGGCCA CCGCAGCCGC CACACTCCTG 
CGGCACCGGT GGCGTCGGCG GTGTGAGGAC 



ACGCCGTGAC CTTCGTCGAG 
TGCGGCACTG GAAGCAGCTC 
GCAACGCCTA CACCATCGGG 
CGTTGCGGAT GTGGTAGCCC 
GGTTCATCAC CGCCGGCCAC 
CCAAGTAGTG GCGGCCGGTG 
CCTTCGCCGG GTCCAGCTTC 
GGAAGCGGCC CAGGTCGAAG 
TGAACCTGCT GGCCCAGGTC 
ACTTGGACGA CCGGGTCCAG 
CCGCGGCCCC CGTCGGCTCG 
GGCGCCGGGG GCAGCCGAGC 
GCACCATCAC TGCGCTCAAC 
CGTGGTAGTG ACGCGAGTTG 
TCCGCACCAC CGTCTGCGCC 
AGGCGTGGTG GCAGACGCGG 
AGGCCCAGGG CGTCACGTCC 
TCCGGGTCCC GCAGTGCAGG 
TCCAGCCGGT CAACCCCATC 
AGGTCGGCCA GTTGGGGTAG 
1162 

GCAGCAGCCC GGCCCCTGCA 
CGTCGTCGGG CCGGGGACGT 
CCCTCGCGGC CGGCCGGGCC 
GGGAGCGCCG GCCGGCCCGG 
CCGGGACCCA CAGCGTGTGC 
GGCCCTGGGT GTCGCACACG 
AGCGCTGGAA CGGCAGCTCC 
TCGCGACCTT GCCGTCGAGG 
AGACCATCAC CTACCGCGGC 
TCTGGTAGTG GATGGCGCCG 

GCTCCGGTGC CTACACCATG 
CGAGGCCACG GATGTGGTAC 



CCCGCACCAC 
GGGCGTGGTG 
CTCCTCGAGG 
GAGGAGCTCC 
GCGTTCCAGC 
CGCAAGGTCG 
TTCGCCGGCA 
AAGCGGCCGT
GTCGAGGAGG 
CAGCTCCTCC 
CTCGAGGCCT 
GAGCTCCGGA 
TGGTACGTCG 
ACCATGCAGC 
GTCGCCGCCG 
CAGCGGCGGC 

ACCGACGAGA 
TGGCTGCTCT 
GGGCGCAGCC 
CCCGCGTCGG 
TGCGGCCGCA 
ACGCCGGCGT
CCGGGCAACG 
GGCCCGTTGC 
AACAACTACT 
TTGTTGATGA 
GCCGTGTGCC 
CGGCACACGG 
TCCTCGGTCA 
AGGAGCCAGT 
GAGCCCGGCG 
CTCGGGCCGC 
GGCGGCTCCG 
CCGCCGAGGC 
CTCCAGGCGT 
GAGGTCCGCA 

CCGACCTCCT 
GGCTGGAGGA 
GCCGCCCAGC 
CGGCGGGTCG 
CTCAACGGGC 
GAGTTGCCCG 
TGGGTGACCG 
ACCCACTGGC 
AACGCCGGCT 
TTGCGGCCGA 

GGGCTCACCC 
CCCGAGTGGG 



CCGGGAGCGC 
GGCCCTCGCG 
CCATGGAGCG 
GGTACCTCGC 
ACGACGCAGC 
TGCTGCGTCG 
CCTGGGTCGA 
GGACCCAGCT 
TCGAGGGCGA 
AGCTCCCGCT 
GGAAGACCGT 
CCTTCTGGCA 
ACGTCCCGAC 
TGCAGGGCTG 
GCCTCGTCGA 
CGGAGCAGCT 

CCCCGCGGAC 
GGGGCGCCTG 
GCTGCTCGAT 
CGACGAGCTA 
CCGGCGCCAC 
GGCCGCGGTG 
ACTACGCGTT 
TGATGCGCAA 
CCGGTGGCCG 
GGCCACCGGC 
GGTCCGGGTC 
CCAGGCCCAG 
CCTACCCCGA 
GGATGGGGCT 
ACTCCGGTGG 
TGAGGCCACC 
GCAACTGGCG 
CGTTGACGGC 
ACGGCCTGAG 
TGCCGGACTC 

GCACCGGCTA 
CGTGGCCGAT 
CCAACGGGTC 
GGTTGCCCAG 
CCTCCGGTGC 
GGAGGCCACG 
TCGCCCAGAG 
AGCGGGTCTC 
ACTACCGCTA 
TGATGGCGAT 

1486* 
TCCCCTGA 
AGGGGACT 



GAGCGCACCG 
CTCGCGTGGC 
CGACCTGGGC 
GCTGGACCCG 
CGAGACCGGC 
GCTCTGGCCG 
GGACGACGTC 
CCTGCTGCAG 
AGGCGCCACG 
TCCGCGGTGC 
CCTCGACGCC 
GGAGCTGCGG 
CAACAGCGTC 
GTTGTCGCAG 
AGGTGCCGAC 
TCCACGGCTG 

595 
CATGTTCGAC 
GTACAAGCTG 
CGGGTTCGCG 
GCCCAAGCGC 
CACCGCCAAC 
GTGGCGGTTG 
CGTCCGTACC 
GCAGGCATGG 
CGTCGAGGTC 
GCAGGTCCAG 
GACCACCGGG 
CTGGTGGCCC 
GGGCACCGTC 
CCCGTGGCAG 
CTCGCTGCTC 
GAGCGACGAG 
CACCGGTGGC 
GTGGCCACCG 
GATGATCACC 
CTACTAGTGG 

CGCCCGCACC 
GCGGGCGTGG 
CTACGTGCAG 
GATGCACGTC 
GGACTTCGAC 
CCTGAAGCTG 
CACCTCCCCC 
GTGGAGGGGG 
CGTGGTCAAC 
GCACCAGTTG 

(SEQ ID NO: 2) 



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The following DNA sequence (SEQ ID NO:3) encodes the precursor protease 
derived from Cellulomonas strain 69B4 (DSM 16035). 



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1 
61 
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481 
541 
601 
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901 
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1201 
1261 
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1381 






AACGAGCCCG 
TTGCTCGGGC 
CCCGACCTCC 
GGGCTGGAGG 
ACCCTGGCGT 
TGGGACCGCA 
GAGGACTTCG 
CTCCTGAAGC 
GACGCCGTCG 
CTGCGGCAGC 
GCCGACCTCG 
CGGCTGGAGC 
CCCACCTGGT 
GGGTGGACCA 
CAGGACGTCG 
GTCCTGCAGC 
GTCGAGACCG 
CAGCTCTGGC 
ATCGGGGGGC 
TAGCCCCCCG 
GGCCACTGCG 
CCGGTGACGC 
AGCTTCCCGG 
TCGAAGGGCC 
CAGGTCAACA 
GTCCAGTTGT 
GGCTCGGCCG 
CCGAGCCGGC 
CTCAACTCCT 
GAGTTGAGGA 
TGCGCCGAGC 
ACGCGGCTCG 
ACGTCCGGCG 
TGCAGGCCGC 
CCCATCCTCC 
GGGTAGGAGG 
CCTGCACCGA 
GGACGTGGCT 
CGGGCCGCCG 
GCCCGGCGGC 
GTGTGCCTCA 
CACACGGAGT 
AGCTCCTGGG 
TCGAGGACCC 
CGCGGCAACG 
GCGCCGTTGC 
ACCATGGGGC 
TGGTACCCCG 



CACCACCCGG 
GTGGTGGGCC 
TCGAGGCCAT 
AGCTCCGGTA 
TCCAGCACGA 
AGGTCGTGCT 
CCGGCACCTG 
GGCCGTGGAC 
AGGAGGTCGA 
TCCTCCAGCT 
AGGCCTGGAA 
TCCGGACCTT 
ACGTCGACGT 
TGCAGCTGCA 
CCGCCGGCCT 
GGCGGCCGGA 
ACGAGACCCC 
TGCTCTGGGG 
GCAGCCGCTG 
CGTCGGCGAC 
GCCGCACCGG 
CGGCGTGGCC 
GCAACGACTA 
CGTTGCTGAT 
ACTACTCCGG 
TGATGAGGCC 
TGTGCCGGTC 
ACACGGCCAG 
CGGTCACCTA 
GCCAGTGGAT 
CCGGCGACTC 
GGCCGCTGAG 
GCTCCGGCAA 
CGAGGCCGTT 
AGGCGTACGG 
TCCGCATGCC 
CCTCCTGCAC 
GGAGGACGTG 
CCCAGCCCAA 
GGGTCGGGTT 
ACGGGCCCTC 
TGCCCGGGAG 
TGACCGTCGC 
ACTGGCAGCG 
CCGGCTACTA 
GGCCGATGAT 
TCACCCTCCC 
AGTGGGAGGG 



GAGCGCGAGC 
CTCGCGCTCG 
GGAGCGCGAC 
CCTCGCGCTG 
CGCAGCCGAG 
GCGTCGGCTC 
GGTCGAGGAC 
CCAGCTCCTG 
GGGCGAAGGC 
CCCGCTTCCG 
GACCGTCCTC 
CTGGCAGGAG 
CCCGACCAAC 
GGGCTGGTTG 
CGTCGAAGGT 
GCAGCTTCCA 
GCGGACCATG 
CGCCTGGTAC 
CTCGATCGGG 
GAGCTAGCCC 
CGCCACCACC 

CGCGTTCGTC 
GCGCAAGCAG 
TGGCCGCGTC 
ACCGGCGCAG 
CGGGTCGACC 
GCCCAGCTGG 
CCCCGAGGGC 
GGGGCTCCCG 
CGGTGGCTCG 
GCCACCGAGC 
CTGCCGCACC 
GACGGCGTGG 
CCTGAGGATG 
GGACTCCTAC 
CGGCTACGCC 
GCCGATGCGG 
CGGGTCCTAC 
GCCCAGGATG 
CGGTGCGGAC 
GCCACGCCTG 
CCAGAGCACC 
GGTCTCGTGG 
CCGCTACGTG 
GGCGATGCAC 
CTGA (SEQ 
GACT 



GCACCGCCAC 
CGTGGCGGTG 
CTGGGCCTCG 
GACCCGGAGC 
ACCGGCGAGG 
TGGCCGCTCC 
GACGTCCTGT 
CTGCAGGACA 
GCCACGGCCG 
CGGTGCCGGC 
GACGCCGCCC 
CTGCGGCGGG 
AGCGTCGTCG 
TCGCAGCAGC 
GCCGACGTCC 
CGGCTGCAGG 
TTCGACGTGA 
AAGCTGCACT 
TTCGCGGTCA 
AAGCGCCAGT 
GCCAACCCCA 
CGGTTGGGGT 
CGTACCGGGG 
GCATGGCCCC 
CAGGTCGCCG 
GTCCAGCGGC 
ACCGGGTGGC 
TGGCCCACCG 
ACCGTCCGCG 
TGGCAGGCGC 
CTGCTCGCCG 
GACGAGCGGC 
GGTGGCACCA 
CCACCGTGGT 
ATCACCACGG 
TAGTGGTGCC 
CGCACCTTCA 
GCGTGGAAGT 
GTGCAGGTCA 
CACGTCCAGT 
TTCGACCTCT 
AAGCTGGAGA 
TCCCCCGGCT 
AGGGGGCCGA 
GTCAACGCCG 
CAGTTGCGGC 
ID NO:3)



GCCTGGCCGA 
CGGACCGGCT 
ACGCGGAGGA 
TGCGCCTCCT 
CCCTCGCCGA 
GGGAGCGGCT 
ACGTCGCCAC 
TGCAGCGGTG 
TCACCGTCGA 
AGTGGCAGCT 
TCGAGGGCCA 
AGCTCCCGGT 
TCGCCGTCAA 
AGCGGCAGTT 
CGTCCGACGC 
GCAGGCTGCG 
TCGGCGGCAA 
AGCCGCCGTT 
ACGGCGGGTT 
TGCCGCCCAA 
CCGGGACCTT 
GGCCCTGGAA 
CCGGCGTGAA 
GGCCGCACTT 
GGCACACCGC 
CCGTGTGGCG 
ACTGCGGCAC 
TGACGCCGTG 
GCCTGATCCG 
CGGACTAGGC 
GCAACCAGGC 
CGTTGGTCCG 
CGTTCTTCCA 
GCAAGAAGGT 
ACTCGGGCAG 
TGAGCCCGTC 
CCGGGACCCT 
GGCCCTGGGA 
ACCGGTCCGG 
TGGCCAGGCC 
ACGTGCAGCG 
TGCACGTCGC 
CCAACGAGAC 
GGTTGCTCTG 
CGTCCGGCTC 
GCAGGCCGAG 




GAAGCTCGAC 
CTTCGAGCTG 
AGCCGCCGCC 
TCGGCGGCGG 
AGAGCTCGAC 
TCTCGAGCTG 
CACCGACGAG 
GTGGCTGCTC 
GCACTCCCTG 
CGTGAGGGAC 
CGACGACGTG 
GCTGCTGCAC 
GGCCGGAGCC 
CCGGCCTCGG 
CGTGACCTTC 
GCACTGGAAG 
CGCCTACACC 
GCGGATGTGG 
CATCACCGCC 
GTAGTGGCGG 
CGCCGGGTCC 
GCGGCCCAGG 
CCTGCTGGCC 
GGACGACCGG 
GGCCCCCGTC 
CCGGGGGCAG 
CATCACTGCG 
GTAGTGACGC 
CACCACCGTC 
GTGGTGGCAG 
CCAGGGCGTC 
GGTCCCGCAG 
GCCGGTCAAC 
CGGCCAGTTG 
CAGCCCGGCC 
GTCGGGCCGG 
CGCGGCCGGC 
GCGCCGGCCG 
GACCCACAGC 
CTGGGTGTCG 
CTGGAACGGC 
GACCTTGCCG 
CATCACCTAC 
GTAGTGGATG 
CGGTGCCTAC 
GCCACGGATG 



The following DNA sequence (SEQ ID NO:4) encodes the mature protease derived 
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from Cellulomonas strain 69B4 (DSM 16035). 

1 TTCGACGTGA TCGGCGGCAA CGCCTACACC ATCGGGGGGC GCAGCCGCTG CTCGATCGGG
AAGCTGCACT AGCCGCCGTT GCGGATGTGG TAGCCCCCCG CGTCGGCGAC GAGCTAGCCC

61 TTCGCGGTCA ACGGCGGGTT CATCACCGCC GGCCACTGCG GCCGCACCGG CGCCACCACC
AAGCGCCAGT TGCCGCCCAA GTAGTGGCGG CCGGTGACGC CGGCGTGGCC GCGGTGGTGG

121 GCCAACCCCA CCGGGACCTT CGCCGGGTCC AGCTTCCCGG GCAACGACTA CGCGTTCGTC
CGGTTGGGGT GGCCCTGGAA GCGGCCCAGG TCGAAGGGCC CGTTGCTGAT GCGCAAGCAG

181 CGTACCGGGG CCGGCGTGAA CCTGCTGGCC CAGGTCAACA ACTACTCCGG TGGCCGCGTC
GCATGGCCCC GGCCGCACTT GGACGACCGG GTCCAGTTGT TGATGAGGCC ACCGGCGCAG

241 CAGGTCGCCG GGCACACCGC GGCCCCCGTC GGCTCGGCCG TGTGCCGGTC CGGGTCGACC
GTCCAGCGGC CCGTGTGGCG CCGGGGGCAG CCGAGCCGGC ACACGGCCAG GCCCAGCTGG

301 ACCGGGTGGC ACTGCGGCAC CATCACTGCG CTCAACTCCT CGGTCACCTA CCCCGAGGGC
TGGCCCACCG TGACGCCGTG GTAGTGACGC GAGTTGAGGA GCCAGTGGAT GGGGCTCCCG

361 ACCGTCCGCG GCCTGATCCG CACCACCGTC TGCGCCGAGC CCGGCGACTC CGGTGGCTCG
TGGCAGGCGC CGGACTAGGC GTGGTGGCAG ACGCGGCTCG GGCCGCTGAG GCCACCGAGC

421 CTGCTCGCCG GCAACCAGGC CCAGGGCGTC ACGTCCGGCG GCTCCGGCAA CTGCCGCACC
GACGAGCGGC CGTTGGTCCG GGTCCCGCAG TGCAGGCCGC CGAGGCCGTT GACGGCGTGG

481 GGTGGCACCA CGTTCTTCCA GCCGGTCAAC CCCATCCTCC AGGCGTACGG CCTGAGGATG
CCACCGTGGT GCAAGAAGGT CGGCCAGTTG GGGTAGGAGG TCCGCATGCC GGACTCCTAC

541 ATCACCACGG ACTCGGGCAG CAGCCCG (SEQ ID NO:4)
TAGTGGTGCC TGAGCCCGTC GTCGGGC


The following DNA sequence (SEQ ID NO:5) encodes the signal peptide derived
from Cellulomonas strain 69B4 (DSM 16035).

1 ATGACACCAC GCACAGTCAC GCGGGCCCTG GCCGTGGCCA CCGCAGCCGC CACACTCCTG
TACTGTGGTG CGTGTCAGTG CGCCCGGGAC CGGCACCGGT GGCGTCGGCG GTGTGAGGAC

61 GCAGGCGGCA TGGCCGCCCA GGCC (SEQ ID NO:5)
CGTCCGCCGT ACCGGCGGGT CCGG
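The relationship between the signal peptide DNA sequence and its amino acid sequence can be checked by translation. The following is a minimal sketch, assuming the standard genetic code and taking the second block of the top strand as GCACAGTCAC, as implied by the complementary strand (CGTGTCAGTG) and by the SEQ ID NO:2 listing. All function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(listing: str) -> str:
    """Remove the block spacing used in the printed listing."""
    return "".join(listing.split()).upper()


def translate(dna: str) -> str:
    """Translate complete codons in frame 1 ('*' marks a stop codon)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return dna.translate(COMPLEMENT)


# Top and bottom strands of the signal peptide listing.
top = clean(
    "ATGACACCAC GCACAGTCAC GCGGGCCCTG GCCGTGGCCA CCGCAGCCGC CACACTCCTG "
    "GCAGGCGGCA TGGCCGCCCA GGCC"
)
bottom = clean(
    "TACTGTGGTG CGTGTCAGTG CGCCCGGGAC CGGCACCGGT GGCGTCGGCG GTGTGAGGAC "
    "CGTCCGCCGT ACCGGCGGGT CCGG"
)

peptide = translate(top)
print(peptide)
print("strands complementary:", complement(top) == bottom)
```

The translated peptide can be compared directly with the 28-residue signal peptide given as SEQ ID NO:9.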

The following sequence is the amino acid sequence (SEQ ID NO:6) of the signal
sequence and precursor protease derived from Cellulomonas strain 69B4 (DSM 16035),
including the signal sequence [segments 1a-c] (residues 1-28 [-198 to -171]), an N-terminal
prosequence [segments 2a-r] (residues 29-198 [-170 to -1]), a mature protease [segments
3a-t] (residues 199-387 [1-189]), and a C-terminal prosequence [segments 4a-l] (residues
388-495 [190-297]) encoded by the DNA sequences set forth in SEQ ID NOS:1, 2, 3 and 4.
The N-terminal sequence of the mature protease amino acid sequence is in bold.



  1 MTPRTVTRAL AVATAAATLL AGGMAAQA NE PAPPGSASAP PRLAEKLDPD
    1a         1b         1c       2a 2b         2c

 51 LLEAMERDLG LDAEEAAATL AFQHDAAETG EALAEELDED FAGTWVEDDV
    2d         2e         2f         2g         2h

101 LYVATTDEDA VEEVEGEGAT AVTVEHSLAD LEAWKTVLDA ALEGHDDVPT
    2i         2j         2k         2l         2m

151 WYVDVPTNSV VVAVKAGAQD VAAGLVEGAD VPSDAVTFVE TDETPRTM FD
    2n         2o         2p         2q         2r       3a

201 VIGGNAYTIG GRSRCSIGFA VNGGFITAGH CGRTGATTAN PTGTFAGSSF
    3b         3c         3d         3e         3f

251 PGNDYAFVRT GAGVNLLAQV NNYSGGRVQV AGHTAAPVGS AVCRSGSTTG
    3g         3h         3i         3j         3k

301 WHCGTITALN SSVTYPEGTV RGLIRTTVCA EPGDSGGSLL AGNQAQGVTS
    3l         3m         3n         3o         3p

351 GGSGNCRTGG TTFFQPVNPI LQAYGLRMIT TDSGSSP APA PTSCTGYART
    3q         3r         3s         3t      4a  4b

401 FTGTLAAGRA AAQPNGSYVQ VNRSGTHSVC LNGPSGADFD LYVQRWNGSS
    4c         4d         4e         4f         4g

451 WVTVAQSTSP GSNETITYRG NAGYYRYVVN AASGSGAYTM GLTLP (SEQ ID NO:6)
    4h         4i         4j         4k         4l



The following sequence (SEQ ID NO:7) is the amino acid sequence of the precursor
protease derived from Cellulomonas strain 69B4 (DSM 16035).

  1 NEPAPPGSAS APPRLAEKLD PDLLEAMERD LGLDAEEAAA TLAFQHDAAE
 51 TGEALAEELD EDFAGTWVED DVLYVATTDE DAVEEVEGEG ATAVTVEHSL
101 ADLEAWKTVL DAALEGHDDV PTWYVDVPTN SVVVAVKAGA QDVAAGLVEG
151 ADVPSDAVTF VETDETPRTM FDVIGGNAYT IGGRSRCSIG FAVNGGFITA
201 GHCGRTGATT ANPTGTFAGS SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV
251 QVAGHTAAPV GSAVCRSGST TGWHCGTITA LNSSVTYPEG TVRGLIRTTV
301 CAEPGDSGGS LLAGNQAQGV TSGGSGNCRT GGTTFFQPVN PILQAYGLRM
351 ITTDSGSSPA PAPTSCTGYA RTFTGTLAAG RAAAQPNGSY VQVNRSGTHS
401 VCLNGPSGAD FDLYVQRWNG SSWVTVAQST SPGSNETITY RGNAGYYRYV
451 VNAASGSGAY TMGLTLP (SEQ ID NO:7)



The following sequence (SEQ ID NO:8) is the amino acid sequence of the mature
protease derived from Cellulomonas strain 69B4 (DSM 16035). The catalytic triad residues
H32, D56 and S137 are bolded and underlined.

  1 FDVIGGNAYT IGGRSRCSIG FAVNGGFITA GHCGRTGATT ANPTGTFAGS
 51 SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV QVAGHTAAPV GSAVCRSGST
101 TGWHCGTITA LNSSVTYPEG TVRGLIRTTV CAEPGDSGGS LLAGNQAQGV
151 TSGGSGNCRT GGTTFFQPVN PILQAYGLRM ITTDSGSSP (SEQ ID NO:8)

The following sequence (SEQ ID NO:9) is the amino acid sequence of the signal 
peptide of the protease derived from Cellulomonas strain 69B4 (DSM 16035). 



1 MTPRTVTRAL AVATAAATLL AGGMAAQA (SEQ ID NO:9) 



WO 2005/052146 



PCT/US2004/039066 




The following sequence (SEQ ID NO:10) is the degenerate primer used to identify a
177 bp fragment of the protease of Cellulomonas strain 69B4.

TTGWXCGT_FW: 5'-ACNACSGGSTGGCRGTGCGGCAC-3' (SEQ ID NO:10)

The following sequence (SEQ ID NO:11) is the reverse primer used to identify a 177
bp fragment of the protease derived from Cellulomonas strain 69B4.

GDSGGX_RV: 5'-ANGNGCCGCCGGAGTCNCC-3' (SEQ ID NO:11)
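The degeneracy of the two primers, i.e., the number of distinct oligonucleotides each degenerate sequence represents, follows from the IUPAC nucleotide codes. The following is a minimal sketch, assuming the primer sequences are as printed above; the function names are illustrative.

```python
from itertools import product
from math import prod

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence the primer represents."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


forward = "ACNACSGGSTGGCRGTGCGGCAC"   # SEQ ID NO:10 as printed
reverse = "ANGNGCCGCCGGAGTCNCC"       # SEQ ID NO:11 as printed

print("forward degeneracy:", degeneracy(forward))
print("reverse degeneracy:", degeneracy(reverse))
print("first forward variants:", expand(forward)[:3])
```

Enumerating the variants is practical only for modest degeneracy; for highly degenerate primers the count alone is usually sufficient.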

The following are the DNA sequence (SEQ ID NO:13) and amino acid sequence (SEQ ID
NO:12) of the 177 bp fragment encoding part of the protease gene derived from
Cellulomonas strain 69B4. The sequences of the degenerate primers (SEQ ID NOS:10 and
11) are underlined and in bold.

DGW DCG TITA LNS SVT YPEG- 
1 ACGACGGCTG GGACTGCGGC ACCATCACTG CGCTCAACTC CTCGGTCACC TACCCCGAGG

TGCTGCCGAC CCTGACGCCG TGGTAGTGAC GCGAGTTGAG GAGCCAGTGG ATGGGGCTCC 

• TVR G L I RTTV CAE PGD SGGS- 
61 GCACCGTCCG CGGCCTGATC CGCACCACCG TCTGCGCCGA GCCCGGCGAC TCCGGTGGCT 

CGTGGCAGGC GCCGGACTAG GCGTGGTGGC AGACGCGGCT CGGGCCGCTG AGGCCACCGA 

• LLAGNQAQGVTSGDSGGS 
121 CGCTGCTCGC CGGCAACCAG GCCCAGGGCG TCACGTCCGG CGACTCCGGC GGCTCAT 

GCGACGAGCG GCCGTTGGTC CGGGTCCCGC AGTGCAGGCC GCTGAGGCCG CCGAGTA 

Analysis of the Sequence of Cellulomonas sp. 69B4 Protease 

A saturated sinapinic acid (3,5-dimethoxy-4-hydroxy cinnamic acid) ("SA") solution in
a 1:1 v/v acetonitrile ("ACN")/0.1% formic acid solution was prepared. The resulting mixture
was vortexed for 60 seconds and then centrifuged for 20 seconds at 14,000 rpm. Then, 5 µl
of the matrix supernatant was transferred to a 0.5 ml Eppendorf tube and 1 µl of a 10
pmole/µl protease 69B4 sample was added to the SA matrix supernatant and vortexed for 5
seconds. Then, 1 µl of the analyte/matrix solution was transferred onto a sample plate and,
after being completely dry, analyzed by a Voyager DE-STR (PerSeptive) matrix-assisted
laser desorption/ionization - time of flight (MALDI-TOF) mass spectrometer, with the
following settings: Mode of operation: Linear; Extraction mode: Delayed; Polarity: Positive;
Accelerating voltage: 25000 V; Extraction delay time: 350 nsec; Acquisition mass range:
4000 - 20000 Da; Number of laser shots: 100/spectrum; and Laser intensity: 2351. The
resulting spectrum is provided in Figure 4.

A tryptic map was produced using methods known in the art (Christianson et al.,
Anal. Biochem. 223:119-29 [1994]), modified as described herein. The protease solution,
containing 10 - 50 µg protease, was diluted 1:1 with chilled water in a 1.5 ml microtube.
1.0 N HCl was added to a final concentration of 0.1 N HCl, mixed thoroughly and incubated
for 10 minutes on ice. Then, 50% trichloroacetic acid ("TCA") was added to a final
concentration of 10% TCA and mixed. The sample was incubated for 10 minutes on ice,
centrifuged for two minutes and the supernatant discarded. Then, 1 ml of cold 90% acetone
was added to resuspend the pellet. The resulting sample was then centrifuged for one
minute, the supernatant quickly decanted and remaining liquid was removed by vacuum
aspiration. The dry pellet was dissolved in 12 µl of 8.0 M urea solution (480 mg urea
[Roche, catalog # 1685899] in 0.65 ml of ammonium bicarbonate solution; final
concentration of bicarbonate: 0.5 M) and incubated for 3-5 minutes at 37°C. The solution
was slowly diluted with 48 µl of an n-octyl-beta-D-glucopyranoside solution ("o-water") (200
mg of n-octyl-beta-D-glucopyranoside [C14H28O6, f.w. 292.4] in 200 ml of water). Then, 2.0
µl of trypsin (2.5 mg/ml in 1 mM HCl) was added and the mixture was incubated for 15
minutes at 37°C. The proteolytic reaction was quenched with 6 µl of 10% trifluoroacetic acid
("TFA"). Insoluble material and bubbles were removed from the sample by centrifugation for
one minute. The tryptic digest was separated by RP-HPLC on a 2.1 x 150 mm C-18 column
(5 µm particle size, 300 angstroms pore size). The elution gradient was formed from 0.1%
(v/v) TFA in water and 0.08% (v/v) TFA in acetonitrile at a flow rate of 0.2 ml/min. The
column compartment was heated to 50°C. Peptide elution was monitored at 215 nm and
data were collected at 215 nm and 280 nm. The samples were then analyzed on an LCQ
Advantage mass spectrometer with a Surveyor HPLC (both from Thermo Finnigan). The
LCQ mass spectrometer was run with the following settings: Spray voltage: 4.5 kV;
Capillary temperature: 225°C. Data processing was performed using TurboSEQUEST and
Xcalibur (ThermoFinnigan). Sequencing of the tryptic digest portions was also performed in
part by Argo BioAnalytica.
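The peptides expected from a tryptic digest can be predicted from the mature protease sequence (SEQ ID NO:8). The following is a minimal sketch, assuming the commonly used rule that trypsin cleaves on the carboxyl side of lysine (K) or arginine (R) except when the next residue is proline (P), and assuming complete digestion with no missed cleavages. The function name is illustrative and the sketch is not the procedure used to generate the tryptic map described above.

```python
# Mature 69B4 protease (SEQ ID NO:8), written without block spacing.
MATURE = (
    "FDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTANPTGTFAGS"
    "SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGST"
    "TGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGV"
    "TSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
)


def tryptic_digest(sequence: str) -> list[str]:
    """Split after K or R unless the following residue is P (assumed rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


peptides = tryptic_digest(MATURE)
for number, peptide in enumerate(peptides, start=1):
    print(number, peptide)
```

Under these assumptions the first predicted peptide is the N-terminal fragment of the mature chain, ending at the first arginine.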

Analysis of the full sequence of the asp gene revealed that it encodes a
prosequence protease of 495 amino acids (SEQ ID NO:6). The first 28 amino acids were
predicted to form a signal peptide. The mature chain of 69B4 protease as
produced by Cellulomonas strain 69B4 has a molecular weight of 18764 (determined by
MALDI-TOF). The sequence of the N-terminus of the mature chain was also determined by
MALDI-TOF analysis and starts with the sequence FDVIGGNAYTIGGR (SEQ ID NO:17). It
is believed that the 69B4 protease has a unique precursor structure with NH2- and COOH-
terminal pro-sequences, as is known to occur with some other enzymes (e.g., T. aquaticus
aqualysin I; See e.g., Lee et al., FEMS Microbiol. Lett., 1:69-74 [1994]; Sakamoto et al.,
Biosci. Biotechnol. Biochem., 59:1438-1443 [1995]; Sakamoto et al., Appl. Microbiol.
Biotechnol., 45:94-101 [1996]; Kim et al., Biochem. Biophys. Res. Commun., 231:535-539
[1997]; and Oledzka et al., Protein Expr. Purif., 29:223-229 [2003]). The predicted
molecular weight of mature 69B4 protease as provided in SEQ ID NO:8 was 18776.42,
which corresponds well with the molecular weight of the purified enzyme with proteolytic
activity isolated from Cellulomonas sp. 69B4 (i.e., 18764). The prediction of the COOH-
terminal pro-sequence in 69B4 protease was also based on an alignment of the 69B4
protease with T. aquaticus aqualysin I, provided below. In this alignment, the amino acid
sequence of the Cellulomonas 69B4 signal sequence and precursor protease is aligned
with the signal sequence and precursor protease aqualysin I of Thermus aquaticus (the
COOH-terminal pro-sequence of aqualysin I is underlined and in bold).

Aqualysin I (1) MRKTYWIJ1ALFAVLVI/5GCQMASRSDPTPTLAEAFWPKEAPVYGLD 

69B4 (1) MT PRTVTRALAVATAAATLLAGGMAAQANE PAP PG SAS AP PRL AEKLD PD 

Consensus (1) MA A LLAG A DP P A A PK A D 

51 100 
Aqualysin I (47) DPEAI PGRYI WFKKGKGQS LLQGG I TTLQARIiAPQGVWTQAYTGALQG 

69B4 (51) LLEAMERDLGLDAEEAAATLAFQHDAAETGEALAEE LDEDFAGTWVE 

Consensus (51) EAI L A A Q LA L F G 

101 150 
Aqualysin I (97) FAAEMAPQALEAFRQ S PDVEFI EADKWRAWATQ S PAPWGLDRI DQRDLP 
69B4 (98) DDVL YVATTDEDAVEE VEGEG ATAVT VEH SLADLEAWKTVLDAALiE GHDD 
Consensus (101) E DEAVAA LD 

151 200 
Aqualysin I (147) LSNS YTYTATGRGVNVYVIDTG IRTTHREFGGRARVG YDALGGNGQDCNG 
69B4 (148) VPTWYVDVPTNS - - VWAVKAG AQD VAAGL VEGADVP S DAVT — FVETDE 
Consensus (151) LYT VIG AV DAL D 

201 250 
Aqualysin I (197) H GTHVAGT I GGVTYGVAKAVI^ Y A VRVLDCNG S G ST SGVI AGVDWVTRNH 

69B4 (194) TPRTMFDVIGGNAYTIGGRS RC S IGFAVNGGF I TAGHCGRTG 

Consensus (201) M IGG Y IA C A G R 

251 300 
Aqualysin I (247) RRP AVANM SLGGG VST ALDNAVKN S I AAG WY AVAAGNDNAN ACNY S PAR 

69B4 (236) ATTANPTGTFAGS SF PGNDYAFVRTGAG VNLLAQVNNYSGGR 

Consensus (251) A SAG ADA S AA NAN NYS AR 

301 350 
Aqualysin I (297) VAEALTVGATTS SDARASF SNYGSCVDLF APGASI P S AWYTSDTATQTLN 
69B4 (278) VQ VAGHTAAP VGSAVCRSG STTGWH CGTI T — ALNSSVTYPEGTVRGLIR 
Consensus (301) VAAAS SSG ASYT I 

351 400 
Aqualysin I (347) GT SMATPH VAGVAAL YLEQNPS ATP AS VAS AI LNGATTGRLSGIGSGS PN 
69B4 (326) TTVCAE PGDSGG SLLAGNQ AQGVTSGGS GNCRTGGTTFFQPVNP I LQAYG 
Consensus (351) TAPAGALQ T A A GT A 

401 450 

Aqualysin I (397) RLLYSLLSSGS GSTAPCTSCSYYTGSLSG PGP YNFQ PNGTYYY S P - A 

69B4 (376) LRMI TTD S - G SS PAPAPTSCTG YARTFTGTLAAGRAAAQPNG S YVQVNRS 
Consensus (401) L S S GS TSCS Y S SG G QPNGSY A 

451 500 
Aqualysin 1 (443) GTHRAWLRGPAGTDFDLYLVJRWDGSRWLTVGSSTGPTSEESLSYSGTAGY 
69B4 (425) GTHS VCLNGP SGADFDLYVQRWNGS SWVTVAQSTS PGSNETI T YRGNAGY 
Consensus (451) GTH L GPAG DFDLYL RW GS WLTVA ST P S ESISY G AGY 
501 521 
Aqualysin I (493) YLWRIYAYSGSGMYEFWIiQRP (SEQ ID NO: 644) 

69B4 (475) YRYWNAASGSGAYTMGLTLP (SEQ ID NO: 645) 

Consensus (501) Y W I A SGSG Y LP (SEQ ID NO :646) 
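The predicted molecular weight of the mature protease quoted above (18776.42) can be approximated from the amino acid composition of SEQ ID NO:8. The following is a minimal sketch, assuming standard average residue masses and one water molecule per chain, and ignoring disulfide bonds and other modifications. The mass table and function name are assumptions of this illustration, so the result may differ slightly from the value computed by the program used in the specification.

```python
# Average residue masses in daltons (amino acid minus water); standard
# textbook values, supplied here as an assumption of this sketch.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain (free N- and C-termini)

# Mature 69B4 protease (SEQ ID NO:8), written without block spacing.
MATURE = (
    "FDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTANPTGTFAGS"
    "SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGST"
    "TGWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGV"
    "TSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
)


def average_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified linear polypeptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


predicted = average_mass(MATURE)
print(f"residues: {len(MATURE)}")
print(f"predicted average mass: {predicted:.2f} Da")
print(f"difference from MALDI-TOF value (18764): {predicted - 18764:.2f} Da")
```

A small difference between the calculated and measured values is expected, since the calculation treats all cysteines as reduced and the measurement carries its own instrumental uncertainty.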



60 



The sequences of three internal peptides of the purified enzyme from Cellulomonas 
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sp. 69B4 having proteolytic activity were determined by MALDI-TOF analysis. All three 
peptides were also identified in the translation product of the isolated asp gene, confirming 
the identification of the correct protease gene (See, SEQ ID NO:1, above). 



Percentage Identity Comparison Between Asp and Streptogrisin

The deduced polypeptide product of the asp gene (mature chain) was used in
homology analysis with other serine proteases using the BLAST program and settings as
described in Example 3. The preliminary analyses showed identities of from about 44 - 48%
(See, Table 4-1, below). Together with analysis of the translated sequence, these results
provided evidence that the asp gene encodes a protease having less than 50% sequence
identity with the mature chains of Streptogrisin-like serine proteases. An alignment of Asp
with Streptogrisin A, Streptogrisin B, Streptogrisin C, and Streptogrisin D of Streptomyces
griseus is provided below. In this alignment, the amino acid sequence of the Cellulomonas
69B4 mature protease ("69B4 mature") is aligned with the mature protease amino acid
sequences of Streptogrisin C ("Sg-StreptogrisinC_mature"), Streptogrisin B
("Sg-StreptogrisinBmature"), Streptogrisin A ("Sg-StreptogrisinAmature"), Streptogrisin D
("Sg-StreptogrisinDmature") and consensus residues.






69B4 mature
Sg-StreptogrisinC mature
Sg-StreptogrisinBmature
Sg-StreptogrisinAmature
Sg-StreptogrisinDmature
Consensus



1 50 

(1) FDVTGGNAYTIGGRSRCSIGFAVN GGFITAGHCGRTGATT 

{ 1 ) ADIRGGDAYYMNGSGRCSVGFSVTRGTQNGFATAGHCGRVGTTTNG — VN 

( 1 ) - - 1 S G GDAI Y S ST - GRC SLGFNVRS G ST YYF LTAGH CTDG ATTWWANS AR 

(1 ) - - 1 AGGEAITTGG - SRCSLGFNVSVNG VAHALTAGHCTNI S ASWS 

( 1 ) - - IAGGDAIWGSG- SRCSIiGFNVVKGGEPYFLTAGHCTESVTSWSD-TQG 
(1) IAGGDAIY G SRCSLGFNV G YFLTAGHCT GTTW 
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Asp mature (41) 

Sg-S trep togri sine mature {49) 

Sg- StreptogrisinBmature {48) 

Sg-StreptogrisinAmature ( 43 ) 

Sg-StreptogrisinDmature {47) 

Consensus (51) 

Asp mature (91) 

Sg-S trep togri sine mature (99) 

Sg-S trep togri s inBmature ( 94 ) 

Sg- S trep togri s inAma ture (90) 

Sg-StreptogrisinDmature ( 97 ) 

Consensus (101) 

Asp mature (140) 

Sg-S treptogri sinC mature (148) 

Sg-StreptogrisinBmature (144) 

Sg-StreptogrisinAmature (140) 

Sg-StreptogrisinDmature (147) 

Consensus (151) 

Asp mature (190) 

Sg-S trep togri sine mature (198) 

Sg-StreptogrisinBmature (186) 

Sg-StreptogrisinAmature ( 182 ) 

Sg-StreptogrisinDmature (189) 

Consensus (201) 



51 100 
ANPTGTFAG S SFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPV 
QQAQGTFQGSTFPGRDIAWVATNANWTPRPIiVNGYGRGDVTVAGSTASW 

TTVLGTTSG S SFPNNDYG I VRYTNTTI PKDGTVGG QDITSAANATV 

IGTRTGTSF PNNDYG 1 1 RHSNPAAADGRVYLYNGS YQDITTAGNAFV 

GSE 1 GANEG S SFPENDYGIiVKYTSDTAH PSEVNLYDGSTQ AITQAGDATV 
IGT GSSFP NDYGIVRYTA VN Y G Q IT AG A V 

101 150 
G SAVCRSG STTGWHCGTITALNS SVTYPEG -TVRGL I RTTVCAE PGDSGG 
GASVCRSGSTTGWHOSTIQQLNTSVTYPEG-TISGVTRTSVCAEPGDSGG 
GMAVTRRG S TTGTH S G SVT ALNAT VNYGGGD WYGMI RTNVC AEPGD SGG 
GQAVQRSGSTTGLRSG SVTGLNATVNYG S S G I VYGMI QTNVCAE PGDSGG 
GQAVTRSGSTTQVHDGEVTALDATVNYGNGDIVNGLIQTTVCAEPGDSGG 
G AV RSGSTTG H GSVTALNATVNYG G IV GLI RTTVCAEPGDSGG 
151 200 
SLLAGNQAQGVTSGGSGNCRTGGTTFFQ PVNP I LQAYGLRM ITTDSGS S P 
S YI SGSQAQGVTSGGSGNCS SGGTTYFQPINPLLQAYGLTLVTSGGGTPT 

PL Y SGTRAI GLT SGG SGNCS SGGTTFFQ PVTEALSA YGV S VY 

S LF AG STALG LT S GG S GNCRTGGTTFY Q P VTEAXi S A YG ATVL 

ALF AGDTALG LTSGG SGDC S SGGTTFFQ PVPEALAAYGAE I G 

SLFAGS ALGLTSGG SGNCS SGGTTFFQPV EALSAYGLTVI 

201 250 

DPPTTPPTDSPGGTWAVGTAYAAGATVTYGGATYRCLQAHTAQPGWTPAD 



251 



Asp mature (190) 



(SEQ ID NO: 8) 
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Sg-StreptogrisinC mature (248) VPADWQRV (SEQ ID NO: 639) 

Sg-StreptogrisinBmature (186) (SEQ ID NO: 640) 

Sg-StreptogrisinAmature (182) (SEQ ID NO:641) 

Sg-StreptogrisinDmature (189) (SEQ ID NO: 642) 

5 Consensus (251) (SEQ ID NO: 643) 



Table 4-1. Percentage Identity: Comparison between Cellulomonas sp. 69B4 Protease
Encoded by asp and Other Serine Proteases (identity between the mature chains)

                     Streptogrisin A   Streptogrisin B   Streptogrisin C   Streptogrisin D   Alphalytic endopeptidase
                     S. griseus        S. griseus        S. griseus        S. griseus        Lysobacter enzymogenes

Asp protease
Cellulomonas sp.     48%               45%               47%               46%               44%
Isolate 69B4
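Percentage identity between two aligned sequences is the fraction of compared alignment columns in which the residues are the same. The following is a minimal sketch, assuming that columns containing a gap in either sequence are excluded from the comparison; BLAST and other programs may define the denominator differently. The two fragments are the ASP and Streptogrisin C rows of the 151-200 block of the alignment above with spacing removed, so the result applies to that fragment only and is not the whole-chain figure reported in Table 4-1.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Fragment of the alignment (block 151-200), spacing removed.
asp_fragment = "SLLAGNQAQGVTSGGSGNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP"
sgc_fragment = "SYISGSQAQGVTSGGSGNCSSGGTTYFQPINPLLQAYGLTLVTSGGGTPT"

fragment_identity = percent_identity(asp_fragment, sgc_fragment)
print(f"identity over this fragment: {fragment_identity:.1f}%")
```

Because this block spans a well-conserved region, the fragment identity is expected to be higher than the identity calculated over the full mature chains.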



Additional protease sequences were also investigated. In these analyses,
proteases homologous in protein sequence to the mature domain of ASP were searched for
using BLAST. Those identified were then aligned using the multiple sequence alignment
program ClustalW. The numbers on the top of the alignment below refer to the amino acid
sequence of the mature ASP protease. The numbers at the side of the alignment are
sequence identifiers, as described at the bottom of the alignment.



20 Sequence 1 10 2 0 30 40 

ASP FDVI GGNAYTI GGRSRC S I GF AVN GGF I TAGHCGRTGATTANPTG TF 

2 TPLI AGGEAITTGGSRC SLGFNV- SVNGVAHALTAGHCTNI SASWS IGTR 

3 — I AGGEAI YAAGGGRC SLGFNVRSS SGATYALTAGHCTEIASTWYTNSGQTSL — LGTR 

4 NKLI QGGDAI YAS SWRC SLGFNVRTS SGAEYFLTAGHCTDGAGAWRAS SGGTV IGQT 

25 5 NKL I QGGDAI YAS SWRC SLGFNVRTS SGAEYFLTAGHCTDGAGAWRAS SGGTV IGQT 

6 TKL I QGGDAI YAS SWRC SLGFNVRSS SGVDYFLTAGHCTDGAGTWY SNSARTTA — IGST 

7 TKLI SGGDAIYSSTGRC SLGFNVRSGS-TYYFLTAGHCTDGATTWWANSARTTV — LGTT 

8 VLGGGAI YGGGSRC SAAFNV- TKGGARYFVTAGHCTNI SANWSAS SGGS V VGVR 

9 QREVAGGDAI YGGGSRC SAAFNV- TKNGVRYFLTAGHCTNLSSTWSSTSGGTS IGVR 

30 10 KPFIAGGDAITGNGGRCSLGFNVTKG-GEPHFLTAGHCTEGISTWSDSSG — QV — I GEN 

11 KPFVAGGDAITGGGGRCSLGFNVTKG-GEPYFITAGHCTESISTWSDSSG — NV — I GEN 

12 TPLIAGGDAIWGSGSRCSLGFNWKG-GEPYFLTAGHCTESVTSWSDTQGG-SE — IGAN 

1 3 KTFASGGDAI FGGGARC SLGFNVTAGDGSAAFLTRGHCGGGATMWSDAQGGQP I - -ATVD 

14 * KTFASGGDAI FGGGARC SLGFNVTAGDGSPAFLTAGHCGVAADQWSDAQGGQPI - -ATVD 
35 15 

1 6 TTRLNGAEPILSTAGRCSAGFNVTDG-TSDFILTAGHCGPTGSVWFGDRPGDGQ- - VGRT 

17 ATVQGGDVYYINRS SRC S I GFAVT TGFVSAGHCGGSGASATTSSGEAL GTF 

1 8 ADIRGGDAYYMNGSGRC S VGF SVTRG- TQNGFATAGHCGRVGTTTNGVNQQAQ GTF 

19 YDLRGGEAYYINNS SRC S I GF P ITKG - TQQGFATAGHCGRAG S STTGANRVAQ GTF 

40 20 YDLVGGDAY YIGN- GRC S I GF SVRQG- STPGFVTAGHCGSVGNATTGFNRVSQ GTF 

2 1 YDLVGGD AYYMGG - GRC S VGF SVTQG - STPGFATAGHCGTVGTSTTGYNQAAQ GTF 

22 EDLVGGDAYYIDDQARC S I GF SVTKD - DQEGF ATAGHCGDPGATTTGYNEADQ GTF 

23 LAAI I GGNP YYFGNYRC S I GF SVRQG- SQTGF ATAGHCG STGTRVS S PSG TV 

2 4 ANI VGGI EYS INNASLC S VGF SVTRG - ATKGFVTAGHC GTVNATARI GGAW GTF 

45 25 AAGTVGGDPYYTGNVRC S IGFSVH GGFVTAGHCGRAGAGVSGWDRSYI GTF 

2 6 VI VPVRDYWGGDALSGCTLAFPVYGG FLTAGHC AVEGKGH ILKTEMTGGQ- IGTV 

2 7 DPPLRSGLAIYGTNVRCS SAFMAYSG- S S YYMMTAGHC AEDS SYWE VPTYS YGYQGVGHV 
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50 60 70 80 90 100 

ASP AGS SFPGN- DYAFVRTGAGVNLLAQVNNYSGGR- VQVAGHTAAPVGS AVCRSGSTTGWHC 

2 TGTSFPNNDYG 1 1 RH SNPAAA — DGRVYLYNGSYQDITTAGNAFVGQAVQRSGSTTGLRS 

3 AGTSFPGNDYGLIRHSNASAA — DGRVYLYNGSYRDITGAGNAYVGQTVQRSGSTTGLHS 
5 4 AGS SF PGNDYGI VQYTGS VSRPGTANGVDITRAATPSVGTTVIRDGSTTGTHS 

5 AGS SF PGND YG I VQYTG S VSRPGTANGVDITRAATPSVGTTVIRDGSTTGTHS 

6 AGSSFPGNDYGIVRYTGS VSRPGTANGVD ITRAATPSVGTTVI RDGSTTGTH S 

7 SGSSFPNNDYGIVRYTNTT 1 PKDGTVGGQD ITS AANATVGMAVTRRGSTTGTHS 

8 EGTSFPT1TOYGIVRYTIX3SSP--AGTVDLYNGSTQDISSAANAVVGQAIKKSGSTTKVTS 
10 9 EGT SF PTNDYGI VRYTTTTNV- - DGRVNL YNGGYQD I ASAADAVVGQAIKK SGSTTKVTS 

1 0 AAS SF PGDDYGLVK YTADVAH - - PSQVNLYDGSSQSI SGAAEAAVGMQVTRSGSTTQVHS 

11 AAS S F PDNDYGLVKYTADVDH — P SEVNL YNG S SQAI SGAAEATVGMQVTRSGSTTQVHD 

12 EGS S F PENDYGLVKYTSDTAH — P SEVNL YDG STQAITQAGDATVGQAVTRSGSTTQVHD 

13 QAVFPPEGDFGLVRYDGPSTE — APSEVDLGDQTLPISGAAEASTVGQEVFRMGSTTGLAD 
15 14 QAVFPGEGDFALVRYDDPATE — APSEVDLGDQTLPISGAAEAAVGQEVFRMGSTTGLAD 

15 

1 6 VAG SF PGDDF SLVEYANGKAGDGADVVAVGDGKGVRITGAGEP AVGQRVFRSGSTSGLRD 

1 7 SGSW PGSADMAYVRTVSGTVLRGYINGYGQGS - F PVSGS SEAAVGAS I CRSGSTTQVHC 

18 QG STF PGR- DI AWVATNANWTPRPLVNGYGRGD - VTVAGSTASVVGASVCRSGSTTGWHC 
20 19 QG S I F PGR - DMAWVATNS S WTATPYVLGAGGQN- VQVTGSTAS PVGAS VCRSGSTTGWHC 

2 0 RGSWFPGR-DMAWAVNSNWTPTSLVRNSGSG — VRVTGSTQATVGSSICRSGSTTGWRC 

2 1 EESSFPGD-DMAWVSVNSDWNTTPTVNEGE VTVSGSTEAAVGAS ICRSGSTTGWHC 

2 2 QASTF PGK - DMAWVGVNSDWTATPDVKAEGGEK - IQLAGSVEALVGASVCRSGSTTGWHC 

2 3 AGSYF PGR- DMGWVRITS ADTVTPLVNRYNGGT- VTVTGSQEAATGS SVCRSGATTGWRC 

25 24 AARVF PGN- DRAWVSLTSAQTLLPRVANGS SF — VTVRGSTEAAVGAAVCRSGRTTGYQC 

2 5 QGS SF PDN- DYAWVSVGS GWWTVPWLGWGTV S DQLVRGSNVAPVGAS ICRSGSTTHWHC 

2 6 EASQFGDG I DAAWAKNYGDWNGRGRVTHWNGGGGVD I KGSNEAAVGAHMCKSGRTTKWTC 

27 ADYTFGYYGDSAIVRVDDPGF WQPRGWVY PSTRI TNWDYDYVGQYVCKQGSTTGYTC 

30 110 120 130 140 150 

AS P GT I TALNS S VTYPEGTV- RGL I RTTVC AEPGD SGGSLLAGN- Q AQGVTSGGS 

2 GSVTGLNATVNYGSSGIVYGMIQTNVCAEPGDSGGSLF-AGSTALGLTSGGS 

3 GRVTGLNATVNY GGGDI VSGL I QTNVC AE PGD S GGALF - AG STALGLT SGGS 

4 GRVTALNATVNYGGGDWGGL I QTTVC AE PGDSGGSLYGSNGTAYGLTSGGS 

35 5 GRVTT^LNATVNYGGGDWGGIj I QTTVC AEPGD SGGSLYG SNGTAYGLTSGG S 

6 GRVTAIjNATVNYGGGD I VSGD I QTTVC AEPGD SGGPLYG SNGTAYGLTSGG S 

7 G SVTALNATVNYGGGDWYGMI RTNVC AEPGD SGG PLY- SGTRAIGLTSGGS 

8 GTVTAVNVTVNYGDGP-VYNMGRTTACSAGGDSGGAHF-AGSVALGIHSGSS 

9 GTVSAVNVTVNYSDGP-VYGMVRTTACSAGGDSGGAHF-AGSVALGIHSGSS 

40 10 GTVTGLDATVNYGNGDIVNGLI QTDVC AEPGDSGG SLF SGDK- AVGLTSGGS 

11 GTVTGLDATVNYGNGDIVNGLIQTDVCAEPGDSGGSLFSGDQ-AIGLTSGGS 

12 GEVTALDATVNYGNGDIVNGLIQTTVCAEPGDSGGALFAGDT-ALGLTSGGS 

1 3 GQVLGLDVTVNYPEG-TVTGLI QTDVC AEPGD SGGSLFTRDGLAIRLTSGGT 

14 GQVLGLDATVNYPEG-MVTGLIQTDVCAEPGDSGGSLFTRDGLAIGLTSGGS 

45 15 VDGL I QTDVC AE PGD SGGALFDGDA- AI GLTSGGS 

16 GRVTALDATVNYPEG-TVTGLIETDVCAEPGDSGGPMFSEGV-ALGVTSGGS 

17 GTI GAKGATVNYPQGAV- SGLTRTSVC AEPGDSGG S FYSG S - QAQGVTS GGS 

18 GTIQQLNTSVTYPEGTI-SGVTRTSVCAEPGDSGGSYISGS-QAQGVTSGGS 

19 GTVTQLNTSVTYQEGTI - S PVTRTTVC AEPGDSGGSF I SGS - Q AQGVTSGGS 

50 20 GTIQQHNTSVTYPQGTI-TGVTRTSACAQPGDSGGSFISGT-QAQGVTSGGS 

21 GTIQQHNTSVTYPEGTI-TGVTRTSVCAEPGDSGGSYISGS-QAQGVTSGGS 

2 2 GTIQQHDTSVTYPEGTV-DGLTETTVCAEPGDSGGPFVSGV- QAQGTTSGGS 

23 GTI QSKNQTVRYAEGTV- TGLTRTTAC AEGGDSGGPWLTG S - Q AQGVT SGGT 

2 4 GTI TAKNVTANYAEGAV- RGLTQGNACMGRGDSGG SWITS AGQ AQGVMS GGNVQ SNGNNC 

55 25 GTVLAHNETVNYSDGSWHQLTKT SVC AEGGD SGG SF I SGD-QAQGVTSGGW 

26 GYLLRKDVSVNYGNGHI - VTLNETSACALGGDSGGAYVWND- QAQG ITSGSN 

27 GQITETNATVSYPGRTL-TGMTWSTACDAPGDSGSGVYDGSTAHGILSGGPN 

160 170 180 189 

60 ASP GNCRTGGTTFFQPVNPILQAYGLRMITTDSGSSP (SEQ ID NO: 18) 

2 GNCRTGGTTF YQ PVTEALSAYGATVL (SEQ ID NO: 19) 

3 GNCRTGGTT (SEQ ID NO: 20) 

4 GNC S SGGTTFFQPVTEALS AYGVS VY (SEQ ID NO: 21) 

5 GNC S SGGTTFFQPVTEALS AYGVS VY (SEQ ID NO: 22) 

65 6 GNC S SGGTTFFQPVTEAL S AYGVSVY (SEQ ID NO: 23) 

7 GNC S SGGTTFFQPVTEAL SAYGVSVY (SEQ ID NO: 24) 

8 GC S GTAGS AIHQ PVTKALS AYGVTVYL (SEQ ID NO: 25) 
9 GCTGTNG SAIHQ PVREAL SAYGVNVY

1 0 GDCTSGGTTFFQPVTEALSATGTQIG 

11 GDCTSGGETFFQPVTEAIiSATGTQIG 

1 2 GDC S SGGTTFFQPVPEALAAYGAEIG 

1 3 RDCTSGGETFFQPVTTALAAVGGTLGGEDGGDG- 

1 4 GDCTVGGETFFQPWTALAAVGATLGGEDGGAGA 

1 5 GDCSQGGETFFQPVTEALKAYGAQIGGGQGEPPE 

1 6 GDCAKGGTTFFQPLPEAMASLGVRLIVPGREGAA 

17 GDCSRGGTTYFQPVNRILQTYGLTLVTA 

18 GNCSSGGTTYFQPINPLLQAYGIiTLVTSGG — GT 

19 GDCRTGGETFFQPINALLQNYGLTLKTTGGDDGG 

2 0 GNC SIGGTTFHQPVNPI LSQYGLTLVRS 

2 1 GNCTSGGTTYHQPINPLLS AYGLDLVTG 

2 2 GDCTNGGTTFYQPVNPLLSDFGLTLKTTSA 

2 3 GDCRSGGITFFQPINPLLSYFGLQLVTG 

24 GIPASQRSSLFERLQPILSQYGLSLVTG 

2 5 GNC S SGGETWFQ PVNEI LNRYGLTLHTA 

2 6 -MDTNNC R S FYQ P VNTVLNKWKL S L VT S TD VTT S 
27 SGCGMIHEPI SRALADRGVTLLAG 

In the above listing, the numbers correspond as follows: 

1 ASP Protease

2 Streptogrisin A (Streptomyces griseus)

3 Glutamyl endopeptidase (Streptomyces fradiae)

4 Streptogrisin B (Streptomyces lividans)

5 SAM-P20 (Streptomyces coelicolor)

6 SAM-P20 (Streptomyces albogriseolus)

7 Streptogrisin B (Streptomyces griseus)

8 Glutamyl endopeptidase II (Streptomyces griseus)

9 Glutamyl endopeptidase II (Streptomyces fradiae)

10 Streptogrisin D (Streptomyces albogriseolus)

11 Streptogrisin D (Streptomyces coelicolor)

12 Streptogrisin D (Streptomyces griseus)

13 Subfamily S1E unassigned peptidase (SalO protein) (Streptomyces lividans)

14 Subfamily S1E unassigned peptidase (SalO protein) (Streptomyces coelicolor)

15 Streptogrisin D (Streptomyces platensis)

16 Subfamily S1E unassigned peptidase (3SC5B7.10 protein) (Streptomyces coelicolor)

17 CHY1 protease (Metarhizium anisopliae)

18 Streptogrisin C (Streptomyces griseus)

19 Streptogrisin C (SCD40A.16c protein) (Streptomyces coelicolor)

20 Subfamily S1E unassigned peptidase (I) (Streptomyces sp.)

21 Subfamily S1E unassigned peptidase (II) (Streptomyces sp.)

22 Subfamily S1E unassigned peptidase (SCF43A.19 protein) (Streptomyces coelicolor)

23 Subfamily S1E unassigned peptidase (Thermobifida fusca; basonym Thermomonospora fusca)

24 Alpha-lytic endopeptidase (Lysobacter enzymogenes)

25 Subfamily S1E unassigned peptidase (SC10G8.13c protein) (Streptomyces coelicolor)

26 Yeast-lytic endopeptidase (Rarobacter faecitabidus)

27 Subfamily S1E unassigned peptidase (SC10A5.18 protein) (Streptomyces coelicolor)

Sequences 9 through 27 in the final block of the above listing are SEQ ID NOS:26 through 44,
respectively.



EXAMPLE 5 

Screening for Novel Homologues of 69B4 Protease by PCR 

In this Example, methods used to screen for novel homologues of 69B4 protease are
described. Bacterial strains of the suborder Micrococcineae, in particular from the
families Cellulomonadaceae and Promicromonosporaceae, were ordered from the German
culture collection, DSMZ (Braunschweig), and received as freeze-dried cultures. Additional
strains were received from the Belgian Coordinated Collections of Microorganisms,
BCCM™/LMG (University of Ghent). The freeze-dried ampoules were opened according to
DSMZ instructions and the material was rehydrated with sterile physiological saline (1.5 ml)
for 1 h. Well-mixed, rehydrated cell suspensions (300 µL) were transferred to sterile
Eppendorf tubes for subsequent PCR.

PCR Methods 

(i) Pretreatment of the Samples

The rehydrated microbial cell suspensions were placed in a boiling water bath for 10
min. The suspensions were then centrifuged at 16000 rpm for 5 min (Sigma 1-15
centrifuge) to remove cell debris and remaining cells; the clear supernatant fraction served
as template for the PCR reaction.

(ii) PCR Test Conditions 

The DNA from these types of bacteria (Actinobacteria) is characteristically highly GC 
rich (typically >55 mol%), so addition of DMSO is a necessity. The chosen concentration 
based on earlier work with the Cellulomonas sp. strain 69B4 was 4% v/v DMSO. 

(iii) PCR Primers (chosen from the following pairs) 
Prot-int_FW1    5'-TGCGCCGAGCCCGGCGACTC-3'          (SEQ ID NO:45)
Prot-int_RV1    5'-GAGTCGCCGGGCTCGGCGCA-3'          (SEQ ID NO:46)

Prot-int_FW2    5'-TTCCCCGGCAACGACTACGCGTGGGT-3'    (SEQ ID NO:47)
Prot-int_RV2    5'-ACCCACGCGTAGTCGTTGCCGGGGAA-3'    (SEQ ID NO:48)

Cellu-FW1       5'-GCCGCTGCTCGATCGGGTTC-3'          (SEQ ID NO:49)
Cellu-RV1       5'-GCAGTTGCCGGAGCCGCCGGACGT-3'      (SEQ ID NO:50)
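The primer sequences listed above can be sanity-checked with a short script. The following is a minimal sketch, not part of the original protocol: the helper names are hypothetical, and the script simply shows that each Prot-int forward primer is the reverse complement of the like-numbered reverse primer and reports the GC content of every primer.

```python
# Minimal sketch (helper names are illustrative assumptions, not from the source).
PRIMERS = {
    "Prot-int_FW1": "TGCGCCGAGCCCGGCGACTC",
    "Prot-int_RV1": "GAGTCGCCGGGCTCGGCGCA",
    "Prot-int_FW2": "TTCCCCGGCAACGACTACGCGTGGGT",
    "Prot-int_RV2": "ACCCACGCGTAGTCGTTGCCGGGGAA",
    "Cellu-FW1": "GCCGCTGCTCGATCGGGTTC",
    "Cellu-RV1": "GCAGTTGCCGGAGCCGCCGGACGT",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq.upper()) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, {gc_percent(seq):.0f}% GC")
    for n in ("1", "2"):
        fw, rv = PRIMERS[f"Prot-int_FW{n}"], PRIMERS[f"Prot-int_RV{n}"]
        print(f"Prot-int pair {n} reverse-complementary:", reverse_complement(fw) == rv)
```

Because each Prot-int FW/RV pair anneals to the same internal site on opposite strands, a usable amplicon presumably requires combining one of them with a primer binding elsewhere (for example Cellu-FW1 or Cellu-RV1); the source states only that primers were "chosen from the following pairs".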



(iv) PCR Mixture (all materials supplied by Invitrogen) 



Template DNA                     4 µL

10x PCR buffer                   5 µL

50 mM MgSO4                      2 µL

10 mM dNTPs                      1 µL

Primers (10 µM soln.)            1 µL each
Platinum Taq hifi polymerase     0.5 µL

DMSO                             2 µL

MilliQ water                     33.5 µL
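As a check on the mixture above, the component volumes can be summed and the final concentrations derived. This is a minimal sketch under the assumption that two primers (one forward, one reverse) are added at 1 µL each; the dictionary keys and helper name are illustrative only.

```python
import math

# Component volumes in microlitres, as listed in the PCR mixture.
MIX_UL = {
    "template DNA": 4.0,
    "10x PCR buffer": 5.0,
    "50 mM MgSO4": 2.0,
    "10 mM dNTPs": 1.0,
    "forward primer (10 uM)": 1.0,  # assumption: one forward primer
    "reverse primer (10 uM)": 1.0,  # assumption: one reverse primer
    "Platinum Taq hifi polymerase": 0.5,
    "DMSO": 2.0,
    "MilliQ water": 33.5,
}

TOTAL_UL = sum(MIX_UL.values())


def final_concentration(stock: float, volume_ul: float) -> float:
    """Dilute a stock concentration into the total reaction volume (same units out)."""
    return stock * volume_ul / TOTAL_UL


dmso_percent = 100.0 * MIX_UL["DMSO"] / TOTAL_UL
mgso4_mM = final_concentration(50.0, MIX_UL["50 mM MgSO4"])
dntp_mM = final_concentration(10.0, MIX_UL["10 mM dNTPs"])
primer_uM = final_concentration(10.0, MIX_UL["forward primer (10 uM)"])

if __name__ == "__main__":
    print(f"total volume: {TOTAL_UL} uL")
    print(f"DMSO: {dmso_percent}% v/v")
    print(f"MgSO4: {mgso4_mM} mM, dNTPs: {dntp_mM} mM, each primer: {primer_uM} uM")
```

Under these assumptions the reaction volume is 50 µL and the DMSO fraction is 4% v/v, which agrees with the concentration stated under PCR Test Conditions.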



(v) PCR Protocol 

1) 94°C 5 min

2) 94°C 30 sec

3) 55°C 30 sec

4) 68°C 3 min

5) Repeat steps 2-4 for 29 cycles

6) 68°C 10 min

7) 15°C 1 min
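The nominal duration of this program can be estimated as follows. This is a rough sketch that assumes step 5 means 29 repeats in addition to the first pass through steps 2-4 (30 cycles in total) and that ignores ramp times between temperatures.

```python
# Hold times in seconds, taken from the PCR protocol.
INITIAL_DENATURATION_S = 5 * 60   # step 1
CYCLE_STEPS_S = [30, 30, 3 * 60]  # steps 2-4: denature, anneal, extend
REPEATS = 29                      # step 5
FINAL_EXTENSION_S = 10 * 60       # step 6
FINAL_HOLD_S = 1 * 60             # step 7

# Assumption: the first pass plus 29 repeats gives 30 cycles.
total_cycles = 1 + REPEATS
cycle_s = sum(CYCLE_STEPS_S)
total_s = INITIAL_DENATURATION_S + total_cycles * cycle_s + FINAL_EXTENSION_S + FINAL_HOLD_S

if __name__ == "__main__":
    print(f"{total_cycles} cycles of {cycle_s} s each")
    print(f"nominal run time: {total_s / 60:.0f} min (ramp times excluded)")
```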

The amplified PCR products were examined by agarose gel electrophoresis. Distinct 
bands for each organism were excised from the gel, purified using the Qiagen gel extraction 
kit, and sequenced by BaseClear, using the same primer combinations. 



(vi) Sequence Analysis 

Nucleotide sequence data were analyzed and the DNA sequences were translated 
into amino acid sequences to review the homology to 69B4-mature protein. Sequence 
alignments were performed using AlignX, a component of Vector NTI suite 9.0.0. The 
results are compiled in Table 5-1. The numbering is that used in SEQ ID NO:8. 



Table 5-1. Percent Identity of (translated) Amino Acid Sequences found 
in Natural Isolate Strains Compared to 69B4 Mature Protease 
Microorganism                               No. of Amino Acids   Overlap Position   % Identity

Cellulomonas flavigena DSM 20109            101                  34 - 134           62
Cellulomonas biazotea DSM 20112             114                  26 - 139           68
Cellulomonas fimi DSM 20113                 109                  32 - 140           72
Cellulomonas gelida DSM 20111               48                   142 - 189          69
Cellulomonas iranensis DSM 14785            85                   52 - 123           66
Cellulomonas cellasea DSM 20118             102                  32 - 133           63
Cellulomonas xylanilytica LMG 21723         [illegible]          [illegible]        [illegible]
Oerskovia turbata DSM 20577                 111                  34 - 144           74
Oerskovia jenensis DSM 46000                129                  22 - 150           70
[name illegible]                            134                  35 - 168           53
Promicromonospora citrea DSM 43110          85                   52 - 136           75
Promicromonospora sukumoe DSM 44121         85                   52 - 136           73
Xylanibacterium ulmi LMG 21721              141                  16 - 156           64
Streptomyces griseus ATCC 27001             No PCR product detected homologous to 69B4 protease
Streptomyces griseus ATCC 10137             No PCR product detected homologous to 69B4 protease
Streptomyces griseus ATCC 23345             No PCR product detected homologous to 69B4 protease
Streptomyces fradiae ATCC 14544             No PCR product detected homologous to 69B4 protease
Streptomyces coelicolor ATCC 10147          No PCR product detected homologous to 69B4 protease
Streptomyces lividans TK23                  No PCR product detected homologous to 69B4 protease



These results show that PCR primers based on polynucleotide sequences of the
69B4 protease gene (mature chain; SEQ ID NO:4) are successful in detecting homologous
genes in bacterial strains of the suborder Micrococcineae, in particular from the families
Cellulomonadaceae and Promicromonosporaceae.

Figure 2 provides a phylogenetic tree of ASP protease. The phylogeny of this protease
was examined by a variety of approaches from mature sequences of similar members of the
chymotrypsin superfamily of proteins and ASP homologues for which significant mature
sequence has been deduced. Using protein distance methods known in the art (See e.g.,
Kimura, The Neutral Theory of Molecular Evolution, Cambridge University Press,
Cambridge, UK [1983]), similar trees were obtained whether gaps were included or excluded. The
phylogenetic tree of Figure 2 was constructed from aligned sequences (positions 16-181 of
SEQ ID NO:8) using TREECONW v.1.3b (Van de Peer and De Wachter, Comput. Appl.
Biosci., 10:569-570 [1994]), with tree topology inferred by the Neighbor-Joining
algorithm (Saitou and Nei, Mol. Biol. Evol., 4:406-425 [1987]). This tree indicates that the
ASP series of homologous proteases ("cellulomonadins") forms a
separate subfamily of proteins. In Figure 2, the numbers provided in brackets correspond to
the sequences provided herein.
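To illustrate the kind of protein distance that feeds a Neighbor-Joining tree, the sketch below applies the correction d = -ln(1 - p - 0.2p^2) given by Kimura (1983), where p is the fraction of differing residues. The text does not state which distance option was selected in TREECONW, so the use of this particular formula here is an assumption.

```python
import math


def kimura_protein_distance(p: float) -> float:
    """Kimura (1983) corrected distance from the observed fraction of differing residues.

    d = -ln(1 - p - 0.2 * p**2); undefined once the argument of the log is <= 0.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    argument = 1.0 - p - 0.2 * p * p
    if argument <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(argument)


if __name__ == "__main__":
    # Example: two sequences that are 70% identical differ at p = 0.30 of sites.
    print(f"p = 0.30 -> d = {kimura_protein_distance(0.30):.3f}")
```

The corrected distance is always at least as large as p, reflecting multiple substitutions at the same site that a raw difference count cannot see.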

The following is an alignment between the Cellulomonas 69B4 ASP protease and 
homologous proteases of related genera described herein. 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
10 Cellulomonas fimi 

Cellulomonas Iran ens is 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
15 Oerskovia jenensis 

Cm. cellulans 
Pm. citrea 
Pm . sukumoe 
69B4 (ASP) mature 
20 Consensus 






69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
45 Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
50 Cm. cellulans 

Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 

55 

69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
60 Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
65 Oerskovia turbata 

O.jenenensis revi 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
70 69B4 (ASP) mature 

Consensus 



1 50 

( 1 ) MTPRTVTRALAVATAAATLLAGGMAAQANEPAPPGSASAPPRIAEKLDPD 

(1) 

(!) 

(1 ) 

(1) 

(1 ) 

(1) 

(1) 

(1 ) MARS FWRTIiATACAATALVAG PAALTANAATPTPDTPTVS PQTS SKVS PE 

(1) 

(1) 

(!) — 

(1) 

(1) 

(1) 

51 100 

( 51 ) LLEAMERDLGLDAEEAAATLAFQHDAAETGEALAEELDEDF-AGTWVEDD 

(1) 

(1) 

(!) 

(!) 

(1) 

(1) v 

(1) 

(51) VLRALQRDLGLSAKDATKRLAFQSDAASTEDALADSLDAYAGAWVDPARN 
(1) 

(1) PRAAGRAARS SG SRAS AS 

(1) ; 

(1) 

(1) 

(51) 

101 150 

(100) VL YVATTDEDA VE E VE GEGATA VTVEH S LADLEAWKT VLDAALEG HDD VP 

(!) 

(1) 

. (1) KQTASEFVTRLTIGELNLAAANSPLPIGHAWSTAL 

(1) 

(1) 

( 2 ) GRVRQLPLRGHDVLPARERDPAGLRSASRPGLTRSRRARLDAAGPSARVA 
(1) 

(101) TL YVGVADRAEAKE VRS AGAT PVVVDHTLAELDTWKAAIiDGELNDPAGVP 
(1) 

(19 ) TS PGPTS VTAS AS S CGRATGRRQRWTPEADGTVRAGGKCMDVAWAPRPTA 

(1) 

(1) 

<lj 

(101) 

151 200 

(150) TWYVD VPTN S VVVAVKAG AQDVAAGL VEG AD VP S DAVTF VETDET PRTMF 
(1) J 

(1) V 

(36) GWYVDVTTNTWVNATALAVAQATEIVAAAT^ 

(1) v 

(1) 

(52) AWYVDVPTNKLVVE SVG — DTAAAADAVAAAGL PADAVTLATTE APRTFV 
(1) 

(151) SWFVDVTTNQVVVNVHDGGRALAELAAASAGVP 

(1, 

( 69 ) RR S S SRTARQRG PEVRAQRRGRPRVGAGEQS ASTPPGAHRGTRGAVRAHG 

(1) 

(1) 

(1) F 

(151) 



201 



250 
69B4 (ASP) complete (200)

Cellulomonas gelida (1) 

Cellulomonas flavigena (2) 

Cellulomonas biazotea (86) 

C. fimi. revi (2) 

C. iranensis revi (1) 

Cellulomonas cell as ea (100) 

C. xylanilytica (1) 

Oerskovia turbata (201) 

Oerskovia jenensis (1) 

Cm. cellulans (119) 

Pm. citrea (1) 

Pm. sukumoe (1) 

69B4 (ASP) mature (2) 

Consensus (201) 



DVI GGNAYTI GGRSR- 



-CS IGFAVNGGFITAGHCGRTGA- 



-TTA 



DVIGGNAYYIGSRSR- 
DVT GGNR YRI NNT S R - 
DVI GGDAYYI GGRSR- 



-CSIGFAVEGGFVTAGHCGRAGA STS 

- C S VGFAVSGGF VTAGHCGTTGA TTT 

- C S I GFAVTGGF VTAGHCGRTG A ATT 



DVIGGNAYYINASSR- 
p~ 

DVVGGNAYTMG SGGR - 

R- 



-CSVGFAVEGGFVTAGHCGRAGA- 
-CSIGFAVTGGFVTAGHCGRSGA- 
- C S VGFAVNGGFI TAGHCG S VGT - 
-CSVGFAVNGGFVTAGHCGTVGT- 



-STS 
-TTT 
-RTS 
-RTS 



DVRGGDRY ITRDPGAS SG S ACS I G YA VQGGF VTAGHCGRGGTRRVLTASW 



DVTGGNAYTIGGRSR- 
DVIGG Y I R 



-CSIGFAVNGGFITAGHCGRTGA- 
CSIGFAV GGFVTAGHCGR GA 



-TTA 
TS 



69B4 (ASP) complete (240) 

Cellulomonas gelida (1) 

Cellulomonas flavigena (42) 

Cellulomonas biazotea (126) 

Cellulomonas fimi (42) 

Cellulomonas iranensis (1) 

Cellulomonas cellasea (140) 

C. xylanilytica (27) 

Oerskovia turbata (241) 

Oerskovia jenensis (27) 

Cm. cellulans (169) 

Pm. citrea (1) 

Pm . sukumoe { 1 ) 

69B4 (ASP) mature (42) 

Consensus (251) 



251 300 
NPTGTFAG S SFPGND YAFVRTGAGVNLI^Q VKNYSGGRVQVAGHTAAPVG 

SPSGTFRGSSFPGNDYAWVQVASGNTPRGLVNNHSGGTVRVTGSQQAAVG 
KPSGTFAG S SF PGNDYAWVllVASGNTPVGAvTJ^JYSGGTVAVAGSTQATVG 
SPSGTFAGSSFPGNDYAWVRVASGNTPVGAVNNYSGGTVAVAGSTQAAVG 

FPGNDYAWVQVGSGDTPRGLVNNYAGGTVRVTGSQQAAVG 

SPSGTFRGS SF PGOTYAWVQVASGNTPRGLVWNHSGGTVR VTGSQQAAVG 
SPSGTFAGS SF PGNDYAWVRAASGNTPVGAVNRYDGSRVTVAGSTDAAVG 
GPGGTFRGSOTPGNDYAWVQVDAGNTPVGAV3SINYSGGRVAVAGSTAAPVG 
GPGGTFRG S SFPGNDYAWVQVI3AGNTPVGAVNNYSGGRVAVAGSTAAPVG 
ARMGT VQ AASF PGHDYAWVRVDAGF S P VPRVNNYAGGT VDVAG S AE AP VG 

F PGND Y AWYNTGTDDTLVG AVNNY S G GTVNVAG S TRAAVG ' 

F PGND Y AWVNVG S DDT P I GAVNNY S GGTVNVAG S TQ AAVG 

NPTGTFAG S SFPGNDYAFTOTGAGVTJIiIiAQVNrJYSGGRVQ VAGHTAAPVG 
P GTF GS SFPGND YAWVQVASGNTPVGAVNNYSGGTV VAGST AAVG 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm . sukumoe 
69B4 (ASP) mature 
Consensus 



301 350 

(290) SAVCRSG STTGWHCGT I TALNS SVTYPEGTVRGLIRTTVCAE PGDSGGSL 
(1) 

(92) S WCRSGSTTGWRCGYVRAYNTTVRYAEGSVSGLIRTS VCAEPGDSGGSL 
(176) ASVCRSGSTTGWRCGTIQAFNSTVNYAQGSVSGLIRTNVCAEPGDSGGSL 

( 92 ) ATVCRSGSTTGWRCGTIQAFNATVNYAEGSVSGLIRTNVCAEPGDSGGSL 

(41) AYVCR SG STTGWRCGTVQAYNAS VRYAEGTVSGLI RTNVCAE PGD 

(190) SWCRSGSTTGWRCGYV1UVYNTTVRYAEGSVSGLIRTSVCAEPGDSGGSL 

(77) AAVCRS G STTAWG CGT I Q S RG AS VTYAQGTVS Gh I RTNVCAE PGDSGG SL 

(291) ASVCRSGSTTGWHCGTIGAYNTSVTYPQGTVSGLIRTNVCAEPGDSGGSIj 
( 77 ) SSVOTSGSTTGWRCGTIAAYNSSVTYPQGTVSGLIRTNVCAEPGDSGGSIi 

(219) ASVCRSGATTGWRCGVT EQKNI TVNYGNGDVPGLVRGS ACAEGGDSGGSV 

(41) ATOCRSG STTGWHCGTI QALNAS VT YAEGTVSGLIRTNVCAEPGD 

(41) S TVCR S G S TTGWH CGT I Q AFN ASVTYAEGT VS GL I RTNVCAE PGD. 

(92) SAVCRSGSTTGWH03TITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSL 

(301) ASVCRSGSTTGWRCGTI AYNASV YAEGTVSGLIRTNVCAEPGDSGGSL 



69B4 (ASP) complete 
Cellulomonas gelida 
Cel lul omonas f 1 avi gena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm . sukumoe 
69B4 (ASP) mature 
Consensus 



351 400 

( 340 ) LAGNQAQGVTSGG SGNCRTGGTTFFQP VNP I LQ AYGLRMITT- DSGS SPA 
( 1 ) LAGNQAQGVTSGG SGNCS SGGTT YFQPVNEALRVYGLTIjVTS -DGGGTE - 

( 142 ) VAGTQAQGVTSGGSGNCRYGGTTYFQPVNEIIiQDQPGPSTTR-AL 

(226) I AGNQAQGLTSGG SGNCTTGGTTYFQP VNE AL SAYGLTLVTS SGGGGGGG 

(142) VAG 

(86 , 

(240) VAGTQAQGVTSGGSGNCRYGGTT YFQPVNE I LQ A YGLRLVLG- HARGGP S 
(127) I AGTQARGVTSGG SGNC 

(341) LAGNQAQGVTSGG SGNCS SGGTTYFQPVNEALGG YGLTLVTSDGGG PSRR 
( 127 ) LAGNQAQGLTSGGSGNCSSGGTTYFQPVNEALSAYGLTLVTSGGRGNC — 
(269) I SGNQAQGVTSGRINDC SNGGKFLYQPDRRPVARDHGRRVGQRARRARGQ 

(86) . 

(86) 

( 142 ) LAGNQAQGVTSGG SGNCRTGGTTFFQPVNPI LQAYGLRMI TTDSGS S P — 

(351) LAGNQAQGVTSGGSGNC GGTTYFQPVN L YGL LV 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 



(389) - PAPTS CTG YARTFTGTLAAGRAAAQ PNGS YVQVNRSGTHS VCLNGPSGA 

(49 ) -PPPTGCQGYARTYQGSVSAGTSVAQPNGSYVTTG-GGTHRVCLSGPAGT 

(186) 

(276) TTCTG YARTYTGSLASRQSAVQPSGS YVTVGSSGTIRVCLDGPSGT 

(145) — 

(86) 

(289) - PARRAPAP PARA 

(144) 



Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
- Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm . sukumoe 
69B4 (ASP) mature 
Consensus 



69B4 (ASP) complete 
Cellulomonas gelida 
Cellulomonas flavigena 
Cellulomonas biazotea 
Cellulomonas ■ fimi 
Cellulomonas iranensis 
Cellulomonas cellasea 
C. xylanilytica 
Oerskovia turbata 
Oerskovia jenensis 
Cm. cellulans 
Pm. citrea 
Pm. sukumoe 
69B4 (ASP) mature 
Consensus 



(391) RPGARAMRGPTRAASRPGRRS RSERF VRHDRGRATGCA 

(175) 

(319) VHRRPRVRLQ 

(86) 

(86) 

(190) 

(401) 

451 500 

(438) DFDLYVQRWNG S SWVTVAQSTS PGSNETIT YRGNAG YYRYWNAASGSGA 

(97) DLDLYLQKWNGYSWASVAQSTSPGATEAVTyTGTAGYYRYWHAYAGSGA 

(186) 

(322) DFDL YLQKWNG SAW 

(145) 

(86) 

(301) . 

(144) 

(429) 

(175) 

(329) 

(86) 

( 86) 

(190) 

(451) 

501 

(488) YTMGLTLP (SEQ ID NO: 6) 

(147) YTLGATTP (SEQ ID NO: 60) 

(186) (SEQ ID NO:54) 

(336) (SEQ ID NO:56) 

(145) (SEQ ID NO:58) 

(86) (SEQ ID NO: 62) 

(301) (SEQ ID NO: 64) 

(144) (SEQ ID NO: 66) 

(429) (SEQ ID NO: 68) 

(175) (SEQ ID NO: 70) 

(329) (SEQ ID NO: 72) 

(86) — (SEQ ID NO: 74) 

(86) (SEQ ID NO: 76) 

(190) (SEQ ID NO: 8) 

(501) (SEQ ID NO:647) 




EXAMPLE 6 

Detection of Novel Homologues of 69B4 Protease by Immunoblotting 

In this Example, immunoblotting experiments used to detect homologues of 69B4 
are described. The following organisms were used in these experiments : 

1. Cellulomonas biazotea DSM 20112

2. Cellulomonas flavigena DSM 20109

3. Cellulomonas fimi DSM 20113

4. Cellulomonas cellasea DSM 20118

5. Cellulomonas uda DSM 20107

6. Cellulomonas gelida DSM 20111

7. Cellulomonas xylanilytica LMG 21723

8. Cellulomonas iranensis DSM 14785

9. Oerskovia jenensis DSM 46000

10. Oerskovia turbata DSM 20577

11. Cellulosimicrobium cellulans DSM 20424

12. Xylanibacterium ulmi LMG 21721

13. Isoptericola variabilis DSM 10177

14. Xylanimicrobium pachnodae DSM 12657

15. Promicromonospora citrea DSM 43110

16. Promicromonospora sukumoe DSM 44121

17. Agromyces ramosus DSM 43045

The strains were first grown on Heart Infusion/skim milk agar plates (72 h, 30°C) to
confirm strain purity, to check for protease reaction by clearing of the skim milk, and to
serve as inoculum. Bacterial strains were cultivated on Brain Heart Infusion broth supplemented with
casein (0.8% w/v) in 100/500 Erlenmeyer flasks with baffles at 230 rpm, 30°C for 5 days.
Microbial growth was checked by microscopy. Supernatants were separated from cells by
centrifugation for 30 min at 4766 x g. Further solids were removed by centrifugation at 9500
rpm. Supernatants were concentrated using a Vivaspin 20 ml concentrator (Vivascience),
cutoff 10 kDa, by centrifugation at 4000 x g. Concentrates were stored in aliquots of 0.5 mL
at -20°C.
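The centrifugation steps above are given partly in x g and partly in rpm. The two are related through the rotor radius by RCF = 1.118 x 10^-5 x r(cm) x rpm^2. The sketch below applies that standard relation; the rotor radius used in the example is a hypothetical value, since the text does not specify the rotor.

```python
import math


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at the given radius in cm."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


if __name__ == "__main__":
    radius_cm = 7.0  # hypothetical rotor radius; not stated in the source
    print(f"9500 rpm at {radius_cm} cm is about {rcf_from_rpm(9500, radius_cm):.0f} x g")
    print(f"4766 x g at {radius_cm} cm needs about {rpm_from_rcf(4766, radius_cm):.0f} rpm")
```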

Primary antibody 

The primary antibody (EP034323) for the immunoblotting reaction, prepared by
Eurogentec (Liège Science Park, Seraing, Belgium), was raised against two peptides
consisting of amino acids 151-164 and 178-189 in the 69B4 mature protease (SEQ ID
NO:8), namely:

TSGGSGNCRTGGTT (epitope 1; SEQ ID NO:51) and LRMITTDSGSSP (epitope 2;
SEQ ID NO:52), as shown below in the amino acid sequence of 69B4 mature protease:



1 FDVIGGNAYT IGGRSRCSIG FAVNGGFITA GHCGRTGATT ANPTGTFAGS
51 SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV QVAGHTAAPV GSAVCRSGST
101 TGWHCGTITA LNSSVTYPEG TVRGLIRTTV CAEPGDSGGS LLAGNQAQGV
151 TSGGSGNCRT GGTTFFQPVN PILQAYGLRM ITTDSGSSP (SEQ ID NO: 8)
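The stated epitope positions can be confirmed by locating the two peptides in the mature sequence. In the sketch below the sequence is transcribed from SEQ ID NO:8 as printed here, with residues 151-189 read from the ASP row of the alignment earlier in this section; the function name is illustrative.

```python
# 69B4 mature protease (SEQ ID NO:8), transcribed in blocks of ten residues.
MATURE_69B4 = "".join("""
FDVIGGNAYT IGGRSRCSIG FAVNGGFITA GHCGRTGATT ANPTGTFAGS
SFPGNDYAFV RTGAGVNLLA QVNNYSGGRV QVAGHTAAPV GSAVCRSGST
TGWHCGTITA LNSSVTYPEG TVRGLIRTTV CAEPGDSGGS LLAGNQAQGV
TSGGSGNCRT GGTTFFQPVN PILQAYGLRM ITTDSGSSP
""".split())

EPITOPES = {
    "epitope 1 (SEQ ID NO:51)": "TSGGSGNCRTGGTT",
    "epitope 2 (SEQ ID NO:52)": "LRMITTDSGSSP",
}


def locate(peptide: str, protein: str) -> tuple:
    """Return the 1-based (start, end) positions of the first occurrence of peptide."""
    index = protein.find(peptide)
    if index < 0:
        raise ValueError(f"{peptide} not found")
    return index + 1, index + len(peptide)


if __name__ == "__main__":
    print("mature length:", len(MATURE_69B4))
    for name, peptide in EPITOPES.items():
        start, end = locate(peptide, MATURE_69B4)
        print(f"{name}: residues {start}-{end}")
```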



Electrophoresis and Immunoblotting 
Sample preparation 

1. Concentrated culture supernatant (50 µL)

2. PMSF (1 µL; 20 mg/ml)

3. 1 M HCl (25 µL)

4. NuPAGE LDS sample buffer (25 µL) (Invitrogen, Carlsbad, CA, USA)

The components were mixed and heated at 90°C for 10 min.



Electrophoresis 

SDS-PAGE was performed in duplicate using NuPAGE 10% Bis-Tris gels
(Invitrogen) with MES-SDS running buffer at 100 V for 5 min and then at 200 V constant. Where
possible, 25 µL of sample was loaded in each slot. One gel of each pair was stained with
Coomassie Blue and the other gel was used for immunoblotting using the Boehringer
Mannheim chromogenic Western blotting protocol (Roche).

Immunoblotting 

The transfer buffer was Tris (0.25 M) - glycine (1.92 M) -
methanol (20% v/v). The PVDF membrane was pre-wetted by successive moistening in
methanol, deionized water, and finally transfer buffer.

The PAGE gel was briefly washed in deionized water and transferred to blotting pads
soaked in transfer buffer, covered with pre-wetted PVDF membrane and pre-soaked blotting
pads. Blotting was performed in transfer buffer at 400 mA constant for 2.5-3 h. The
membrane was briefly washed (2x) in Tris buffered saline (TBS) (0.5 M Tris, 0.15 M NaCl,
pH 7.5). Non-specific antibody binding was prevented by incubating the membrane in 1% v/v
mouse/rabbit Blocking Reagent (Roche) in maleic acid solution (100 mM maleic acid, 150
mM NaCl, pH 7.5) overnight at 4°C.

The primary antibody used in these reactions was EP034323 diluted 1:1000. The
reaction was performed with the antibody diluted in 1% Blocking Solution with a 30 min reaction
time. The membrane was washed 4x 10 min in TBST (TBS + 0.1% v/v Tween 20).

The secondary antibody consisted of anti-mouse/anti-rabbit IgG (Roche), 73 µL in 20
ml of 1% Blocking Solution, with a reaction time of 30 min. The membrane was washed 4x
15 min in TBST and the substrate reaction (alkaline phosphatase) was performed with BM
Chromogenic Western Blotting Reagent (Roche) until staining occurred.

The results of the cross-reactivity with primary polyclonal antibody are shown in 
Table 6-1. 



Table 6-1. Immunoblotting Results 


Columns: Strain; Immunoblot Result; Estimated Molecular Mass (kDa); % Sequence Identity to
69B4 Mature Protease; Protease Activity on HI-Skim Milk Agar

Strain                        Immunoblot   Mass (kDa)    % Identity    Activity

C. flavigena DSM 20109        positive     21            66            positive
C. biazotea DSM 20112         negative                   65            positive
C. fimi DSM 20113             negative                   72            weak +
C. gelida DSM 20111           positive     20            69            weak +
C. uda DSM 20107              negative                                 weak +
C. iranensis DSM 14785        negative                   33            weak +
C. cellasea DSM 20118         positive     27            61            positive
C. xylanilytica LMG 21723     negative                   [illegible]   positive
O. turbata DSM 20577          positive     [illegible]   73            positive
O. jenensis DSM 46000         positive     35            78            positive
C. cellulans DSM 20424        negative                   48            positive
P. citrea DSM 43110           negative                   28            positive
P. sukumoe DSM 44121          negative                   [illegible]   positive
X. ulmi LMG 21721             negative                   72            negative
I. variabilis DSM 10177       negative                                 positive
X. pachnodae DSM 12657        negative                                 weak +
A. ramosus DSM 43045          negative                                 weak +



Based on these results, it is clear that the antibody used in these experiments is
highly specific for homologues with a very high percentage of amino acid sequence
identity to 69B4 protease. Furthermore, these results indicate that the C-terminal portion of
the 69B4 mature protease chain is fairly variable, especially in the region of the two peptide
epitopes. In these experiments, it was determined that more than 2 amino acid differences
in this region resulted in a negative Western blotting reaction.
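The mismatch criterion described above can be expressed as a simple count of differing residues between an epitope and the corresponding region of a homologue. The following sketch is illustrative only: the function names are hypothetical, the threshold of 2 is taken from the text, and the homologue fragment is residues 151-164 of the C. flavigena sequence (SEQ ID NO:54) as listed in Example 7.

```python
def count_differences(epitope: str, region: str) -> int:
    """Number of positions at which two equal-length peptides differ."""
    if len(epitope) != len(region):
        raise ValueError("peptides must be the same length")
    return sum(a != b for a, b in zip(epitope, region))


def expected_to_react(epitope: str, region: str, max_differences: int = 2) -> bool:
    """Apply the rule of thumb from the text: more than 2 differences gave a negative blot."""
    return count_differences(epitope, region) <= max_differences


if __name__ == "__main__":
    epitope_1 = "TSGGSGNCRTGGTT"         # SEQ ID NO:51
    flavigena_151_164 = "TSGGSGNCRYGGTT"  # from SEQ ID NO:54
    print("differences:", count_differences(epitope_1, flavigena_151_164))
    print("expected to react:", expected_to_react(epitope_1, flavigena_151_164))
```

This considers one epitope at a time; how differences in the two epitopes combine to determine the blot outcome is not specified in the text.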

EXAMPLE 7 
Inverse PCR and Genome Walking 

In this Example, experiments conducted to elucidate polynucleotide sequences of 
ASP are described. The microorganisms utilized in these experiments were : 

1. Cellulomonas biazotea DSM 20112

2. Cellulomonas flavigena DSM 20109

3. Cellulomonas fimi DSM 20113

4. Cellulomonas cellasea DSM 20118

5. Cellulomonas gelida DSM 20111

6. Cellulomonas iranensis DSM 14785

7. Oerskovia jenensis DSM 46000

8. Oerskovia turbata DSM 20577

9. Cellulosimicrobium cellulans DSM 20424

10. Promicromonospora citrea DSM 43110

11. Promicromonospora sukumoe DSM 44121

These bacterial strains were cultivated on Brain Heart Infusion broth or Tryptone 
Soya broth in 100/500 Erlenmeyer flasks with baffles at 230 rpm, 30°C for 2 days. Cells 
were separated from the culture broth by centrifugation for 30 min at 4766 x g. 

Chromosomal DNA was obtained by a standard phenol/chloroform extraction method
known in the art from cells digested by lysozyme/EDTA (See e.g., Sambrook et al., supra).
Chromosomal DNA was digested with restriction enzymes selected from the following
list: ApaI, BamHI, BssHII, KpnI, NarI, NcoI, NheI, PvuI, SalI or SstII.
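Where a sequence is already known, candidate enzymes for inverse PCR can be screened by searching for their recognition sites. The sketch below is an illustration, not the procedure used in this Example: the recognition sequences are standard values supplied here rather than taken from the text, and the test fragment is the first 100 nucleotides of the C. flavigena listing in this Example.

```python
# Standard recognition sequences (all palindromic) for the enzymes named above.
RECOGNITION_SITES = {
    "ApaI": "GGGCCC",
    "BamHI": "GGATCC",
    "BssHII": "GCGCGC",
    "KpnI": "GGTACC",
    "NarI": "GGCGCC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "PvuI": "CGATCG",
    "SalI": "GTCGAC",
    "SstII": "CCGCGG",
}


def find_sites(sequence: str, site: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) occurrence of site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


FLAVIGENA_1_100 = (
    "GTCGACGTCATCGGGGGCAACGCGTACTACATCGGGTCGCGCTCGCGGTG"
    "CTCGATCGGGTTCGCGGTCGAGGGCGGGTTCGTCACCGCGGGGCACTGCG"
)

if __name__ == "__main__":
    for enzyme, site in RECOGNITION_SITES.items():
        hits = find_sites(FLAVIGENA_1_100, site)
        if hits:
            print(f"{enzyme} ({site}): positions {[h + 1 for h in hits]}")
```

For inverse PCR, enzymes that do not cut inside the known fragment are generally preferred, so that the flanking unknown sequence stays attached to it after digestion and self-ligation.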

The nucleotide and amino acid sequences of these organisms are provided below.
In these listings, the mature protease is indicated in bold and the signal sequence is
underlined.
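The relationship between each nucleotide listing and its amino acid listing can be checked by translating the top strand in frame. The sketch below uses the standard genetic code; the function name is illustrative, and the example input is the first 30 nucleotides of the C. flavigena listing, read in frame 1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA sequence from the given 0-based frame; incomplete codons are dropped."""
    dna = dna.upper().replace(" ", "")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    fragment = "GTCGACGTCA TCGGGGGCAA CGCGTACTAC"
    print(translate(fragment))
```

Codons containing characters other than A, C, G or T are reported as X, which makes residual transcription errors in a listing easy to spot.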



C. flavigena (DSM 20109)

1 GTCGACGTCA TCGGGGGCAA CGCGTACTAC ATCGGGTCGC GCTCGCGGTG
CAGCTGCAGT AGCCCCCGTT GCGCATGATG TAGCCCAGCG CGAGCGCCAC

51 CTCGATCGGG TTCGCGGTCG AGGGCGGGTT CGTCACCGCG GGGCACTGCG
GAGCTAGCCC AAGCGCCAGC TCCCGCCCAA GCAGTGGCGC CCCGTGACGC

101 GGCGCGCGGG CGCGAGCACG TCGTCACCGT CGGGGACCTT CCGCGGCTCG
CCGCGCGCCC GCGCTCGTGC AGCAGTGGCA GCCCCTGGAA GGCGCCGAGC

151 TCGTTCCCCG GCAACGACTA CGCGTGGGTC CAGGTCGCCT CGGGCAACAC
AGCAAGGGGC CGTTGCTGAT GCGCACCCAG GTCCAGCGGA GCCCGTTGTG

201 GCCGCGCGGG CTGGTGAACA ACCACTCGGG CGGCACGGTG CGCGTCACCG
CGGCGCGCCC GACCACTTGT TGGTGAGCCC GCCGTGCCAC GCGCAGTGGC

251 GCTCGCAGCA GGCCGCGGTC GGCTCGTACG TGTGCCGATC GGGCAGCACG
CGAGCGTCGT CCGGCGCCAG CCGAGCATGC ACACGGCTAG CCCGTCGTGC

301 ACGGGATGGC GGTGCGGCTA CGTCCGGGCG TACAACACGA CCGTGCGGTA
TGCCCTACCG CCACGCCGAT GCAGGCCCGC ATGTTGTGCT GGCACGCCAT

351 CGCGGAGGGC TCGGTCTCGG GCCTCATCCG CACGAGCGTG TGCGCCGAGC
GCGCCTCCCG AGCCAGAGCC CGGAGTAGGC GTGCTCGCAC ACGCGGCTCG

401 CGGGCGACTC CGGCGGCTCG CTGGTCGCCG GCACGCAGGC CCAGGGCGTC
GCCCGCTGAG GCCGCCGAGC GACCAGCGGC CGTGCGTCCG GGTCCCGCAG

451 ACGTCGGGCG GGTCCGGCAA CTGCCGCTAC GGGGGCACGA CGTACTTCCA
TGCAGCCCGC CCAGGCCGTT GACGGCGATG CCCCCGTGCT GCATGAAGGT

501 GCCCGTGAAC GAGATCCTGC AGGACCAGCC CGGGCCGTCG ACCACGCGTG
CGGGCACTTG CTCTAGGACG TCCTGGTCGG GCCCGGCAGC TGGTGCGCAC

551 CCCTA (SEQ ID NO: 53)
GGGAT



Cellulomonas flavigena (DSM 20109)

1 VDVIGGNAYY IGSRSRCSIG FAVEGGFVTA GHCGRAGAST SSPSGTFRGS

51 SFPGNDYAWV QVASGNTPRG LVNNHSGGTV RVTGSQQAAV GSYVCRSGST

101 TGWRCGYVRA YNTTVRYAEG SVSGLIRTSV CAEPGDSGGS LVAGTQAQGV

151 TSGGSGNCRY GGTTYFQPVN EILQDQPGPS TTRAL (SEQ ID NO: 54)



Cellulomonas biazotea (DSM 20112)

1 TAAAACAGAC GGCCAGTGAA TTTGTAATAC GACTCACTAT AGGCGAATTG 
ATTTTGTCTG CCGGTCACTT AAACATTATG CTGAGTGATA TCCGCTTAAC 

51 AATTTAGCGG CCGCGAATTC GCCCTTACCT ATAGGGCACG CGTGGTCGAC 
TTAAATCGCC GGCGCTTAAG CGGGAATGGA TATCCCGTGC GCACCAGCTG 

101 GGCCCTGGGC TGGTACGTCG ACGTCACTAC CAACACGGTC GTCGTCAACG 
CCGGGACCCG ACCATGCAGC TGCAGTGATG GTTGTGCCAG CAGCAGTTGC 

151 CCACCGCCCT CGCCGTGGCC CAGGCGACCG AGATCGTCGC CGCCGCAACG 
GGTGGCGGGA GCGGCACCGG GTCCGCTGGC TCTAGCAGCG GCGGCGTTGC 

201 GTGCCCGCCG ACGCCGTCCG GGTCGTCGAG ACCACCGAGG CGCCCCGCAC 
CACGGGCGGC TGCGGCAGGC CCAGCAGCTC TGGTGGCTCC GCGGGGCGTG 

251 GTTCATCGAC GTCATCGGCG GCAACCGTTA CCGGATCAAC AACACCTCGC 
CAAGTAGCTG CAGTAGCCGC CGTTGGCAAT GGCCTAGTTG TTGTGGAGCG 

301 GCTGCTCGGT CGGCTTCGCC GTCAGCGGCG GCTTCGTCAC CGCCGGGCAC 
CGACGAGCCA GCCGAAGCGG CAGTCGCCGC CGAAGCAGTG GCGGCCCGTG 

351 TGCGGCACGA CCGGCGCGAC CACGACGAAA CCGTCCGGCA CGTTCGCCGG 
ACGCCGTGCT GGCCGCGCTG GTGCTGCTTT GGCAGGCCGT GCAAGCGGCC 

401 CTCGTCGTTC CCCGGCAACG ACTACGCGTG GGTGCGCGTC GCGTCCGGCA 
GAGCAGCAAG GGGCCGTTGC TGATGCGCAC CCACGCGCAG CGCAGGCCGT 

451 ACACCCCGGT CGGCGCCGTG AACAACTACA GCGGCGGCAC CGTGGCCGTC 
TGTGGGGCCA GCCGCGGCAC TTGTTGATGT CGCCGCCGTG GCACCGGCAG 

501 GCCGGCTCGA CGCAGGCGAC CGTCGGTGCG TCCGTCTGCC GCTCCGGCTC 
CGGCCGAGCT GCGTCCGCTG GCAGCCACGC AGGCAGACGG CGAGGCCGAG 

551 CACCACGGGG TGGCGCTGCG GGACGATCCA GGCGTTCAAC TCCACCGTCA 
GTGGTGCCCC ACCGCGACGC CCTGCTAGGT CCGCAAGTTG AGGTGGCAGT 
601 ACTACGCGCA GGGCAGCGTC TCCGGCCTCA TCCGCACGAA CGTGTGCGCC
TGATGCGCGT CCCGTCGCAG AGGCCGGAGT AGGCGTGCTT GCACACGCGG 

651 GAGCCCGGCG ACTCCGGCGG CTCGCTCATC GCCGGCAACC AGGCCCAGGG 
CTCGGGCCGC TGAGGCCGCC GAGCGAGTAG CGGCCGTTGG TCCGGGTCCC 

701 CCTGACGTCC GGCGGGTCGG GCAACTGCAC CACCGGCGGG ACGACGTACT 
GGACTGCAGG CCGCCCAGCC CGTTGACGTG GTGGCCGCCC TGCTGCATGA 

751 TCCAGCCCGT CAACGAGGCG CTCTCCGCCT ACGGCCTGAC GCTCGTCACG 
AGGTCGGGCA GTTGCTCCGC GAGAGGCGGA TGCCGGACTG CGAGCAGTGC

801 TCGTCCGGCG GCGGCGGTGG CGGCGGCACG ACCTGCACCG GGTACGCGCG 
AGCAGGCCGC CGCCGCCACC GCCGCCGTGC TGGACGTGGC CCATGCGCGC 

851 GACCTACACC GGCTCGCTCG CCTCGCGGCA GTCCGCCGTC CAGCCGTCCG 
CTGGATGTGG CCGAGCGAGC GGAGCGCCGT CAGGCGGCAG GTCGGCAGGC 

901 GCAGCTATGT GACCGTCGGG TCCAGCGGCA CCATCCGCGT CTGCCTCGAC 
CGTCGATACA CTGGCAGCCC AGGTCGCCGT GGTAGGCGCA GACGGAGCTG 

951 GGCCCGAGCG GGACGGACTT CGACCTGTAC CTGCAGAAGT GGAACGGGTC
CCGGGCTCGC CCTGCCTGAA GCTGGACATG GACGTCTTCA CCTTGCCCAG 

1001 CGCGTGGGC (SEQ ID NO: 55) 
GCGCACCCG 



Cellulomonas biazotea (DSM 20112)

1 KQTASEFVIR LTIGELNLAA ANSPLPIGHA WSTALGWYVD VTTNTVVVNA

51 TALAVAQATE IVAAATVPAD AVRVVETTEA PRTFIDVIGG NRYRINNTSR

101 CSVGFAVSGG FVTAGHCGTT GATTTKPSGT FAGSSFPGND YAWVRVASGN

151 TPVGAVNNYS GGTVAVAGST QATVGASVCR SGSTTGWRCG TIQAFNSTVN

201 YAQGSVSGLI RTNVCAEPGD SGGSLIAGNQ AQGLTSGGSG NCTTGGTTYF

251 QPVNEALSAY GLTLVTSSGG GGGGGTTCTG YARTYTGSLA SRQSAVQPSG

301 SYVTVGSSGT IRVCLDGPSG TDFDLYLQKW NGSAW (SEQ ID NO: 56)



Cellulomonas fimi (DSM 20113)

1 GTGGACGTGA TCGGCGGCGA CGCCTACTAC ATCGGCGGCC GCAGCCGCTG 
CACCTGCACT AGCCGCCGCT GCGGATGATG TAGCCGCCGG CGTCGGCGAC 

51 TTCGATCGGG TTCGCCGTCA CCGGGGGCTT CGTGACCGCC GGGCACTGCG 
AAGCTAGCCC AAGCGGCAGT GGCCCCCGAA GCACTGGCGG CCCGTGACGC 

101 GCCGCACCGG CGCGGCCACG ACGAGCCCGT CGGGCACGTT CGCCGGCTCG 
CGGCGTGGCC GCGCCGGTGC TGCTCGGGCA GCCCGTGCAA GCGGCCGAGC 

151 AGCTTCCCGG GCAACGACTA CGCGTGGGTG CGGGTCGCGT CGGGCAACAC 
TCGAAGGGCC CGTTGCTGAT GCGCACCCAC GCCCAGCGCA GCCCGTTGTG 
201 GCCCGTCGGC GCGGTGAACA ACTACAGCGG CGGCACGGTC GCCGTCGCCG
CGGGCAGCCG CGCCACTTGT TGATGTCGCC GCCGTGCCAG CGGCAGCGGC 

251 GCTCGACCCA GGCCGCCGTC GGTGCGACCG TGTGCCGCTC GGGCTCCACC 
CGAGCTGGGT CCGGCGGCAG CCACGCTGGC ACACGGCGAG CCCGAGGTGG 

301 ACCGGCTGGC GGTGCGGCAC CATCCAGGCG TTCAACGCGA CCGTCAACTA
TGGCCGACCG CCACGCCGTG GTAGGTCCGC AAGTTGCGCT GGCAGTTGAT 

351 CGCCGAGGGC AGCGTCTCCG GCCTCATCCG CACGAACGTG TGCGCCGAGC 
GCGGCTCCCG TCGCAGAGGC CGGAGTAGGC GTGCTTGCAC ACGCGGCTCG 

401 CCGGCGACTC GGGCGGCTCG CTCGTCGCCG GCAACCAGGC GCAGGGCATG 
GGCCGCTGAG CCCGCCGAGC GAGCAGCGGC CGTTGGTCCG CGTCCCGTAC 

451 ACGTCCGGCG GCTCCGACAA CTGC (SEQ ID NO: 57)
TGCAGGCCGC CGAGGCTGTT GACG 



Cellulomonas fimi (DSM 20113)

1 VDVIGGDAYY IGGRSRCSIG FAVTGGFVTA GHCGRTGAAT TSPSGTFAGS
51 SFPGNDYAWV RVASGNTPVG AVNNYSGGTV AVAGSTQAAV GATVCRSGST
101 TGWRCGTIQA FNATVNYAEG SVSGLIRTNV CAEPGDSGGS LVAG (SEQ ID NO: 58)

Cellulomonas gelida (DSM 20111)

1 CTCGCGGGCA ACCAGGCGCA GGGCGTGACG TCGGGCGGGT CGGGCAACTG 
GAGCGCCCGT TGGTCCGCGT CCCGCACTGC AGCCCGCCCA GCCCGTTGAC 

51 CTCGTCGGGC GGGACGACGT ACTTCCAGCC CGTCAACGAG GCCCTCCGGG 
GAGCAGCCCG CCCTGCTGCA TGAAGGTCGG GCAGTTGCTC CGGGAGGCCC 

101 TGTACGGGCT CACGCTCGTG ACCTCTGACG GTGGGGGCAC CGAGCCGCCG 
ACATGCCCGA GTGCGAGCAC TGGAGACTGC CACCCCCGTG GCTCGGCGGC 

151 CCGACCGGGT GCCAGGGCTA TGCGCGGACC TACCAGGGCA GCGTCTCGGC 
GGCTGGCCCA CGGTCCCGAT ACGCGCCTGG ATGGTCCCGT CGCAGAGCCG 

201 CGGGACGTCG GTCGCGCAGC CGAACGGTTC GTACGTCACG ACCGGGGGCG 
GCCCTGCAGC CAGCGCGTCG GCTTGCCAAG CATGCAGTGC TGGCCCCCGC 

251 GGACGCACCG GGTGTGCCTG AGCGGACCGG CGGGCACGGA CCTGGACCTG 
CCTGCGTGGC CCACACGGAC TCGCCTGGCC GCCCGTGCCT GGACCTGGAC 

301 TACCTGCAGA AGTGGAACGG GTACTCGTGG GCCAGCGTCG CGCAGTCGAC 
ATGGACGTCT TCACCTTGCC CATGAGCACC CGGTCGCAGC GCGTCAGCTG 

351 GTCGCCTGGT GCCACGGAGG CGGTCACGTA CACCGGGACC GCCGGCTACT 
CAGCGGACCA CGGTGCCTCC GCCAGTGCAT GTGGCCCTGG CGGCCGATGA

401 ACCGCTACGT GGTCCACGCG TACGCGGGTT CGGGGGCGTA CACCCTGGGG 
TGGCGATGCA CCAGGTGCGC ATGCGCCCAA GCCCCCGCAT GTGGGACCCC 
451 GCGACGACCC CG (SEQ ID NO: 59)
CGCTGCTGGG GC 



Cellulomonas gelida (DSM 20111)

1 LAGNQAQGVT SGGSGNCSSG GTTYFQPVNE ALRVYGLTLV TSDGGGTEPP

51 PTGCQGYART YQGSVSAGTS VAQPNGSYVT TGGGTHRVCL SGPAGTDLDL

101 YLQKWNGYSW ASVAQSTSPG ATEAVTYTGT AGYYRYVVHA YAGSGAYTLG

151 ATTP (SEQ ID NO: 60)



Cellulomonas iranensis (DSM 14785) 

1 TTCCCCGGCA ACGACTACGC GTGGGTCCAG GTCGGGTCGG GCGACACCCC 

AAGGGGCCGT TGCTGATGCG CACCCAGGTC CAGCCCAGCC CGCTGTGGGG 

51 CCGCGGCCTG GTCAACAACT ACGCGGGCGG CACCGTGCGG GTCACCGGGT 

GGCGCCGGAC CAGTTGTTGA TGCGCCCGCC GTGGCACGCC CAGTGGCCCA 

101 CGCAGCAGGC CGCGGTCGGC GCGTACGTCT GCCGGTCGGG CAGCACGACG 

GCGTCGTCCG GCGCCAGCCG CGCATGCAGA CGGCCAGCCC GTCGTGCTGC 

151 GGCTGGCGCT GCGGCACCGT GCAGGCCTAC AACGCGTCGG TCCGCTACGC 

CCGACCGCGA CGCCGTGGCA CGTCCGGATG TTGCGCAGCC AGGCGATGCG 

201 CGAGGGCACC GTCTCGGGCC TCATCCGCAC CAACGTCTGC GCCGAGCCCG 

GCTCCCGTGG CAGAGCCCGG AGTAGGCGTG GTTGCAGACG CGGCTCGGGC 

251 GCGACTC (SEQ ID NO: 61) 
CGCTGAG 



Cellulomonas iranensis (DSM 14785) 

1 FPGNDYAWVQ VGSGDTPRGL VNNYAGGTVR VTGSQQAAVG AYVCRSGSTT
51 GWRCGTVQAY NASVRYAEGT VSGLIRTNVC AEPGD (SEQ ID NO: 62)



Cellulomonas cellasea (DSM 20118)

1 GTCGGGCGGG TCCGGCAACT GCCGCTACGG GGGCACGACG TACTTCCAGC 
CAGCCCGCCC AGGCCGTTGA CGGCGATGCC CCCGTGCTGC ATGAAGGTCG 

51 CCGTGAACGA GATCCTGCAG GCCTACGGTC TGCGTCTCGT CCTGGGCTGA 
GGCACTTGCT CTAGGACGTC CGGATGCCAG ACGCAGAGCA GGACCCGACT 

101 CACGCTCGCG GCGGGCCCGG CTCGACGCGG CCGGCCCGTC GGCCCGGGTC 
GTGCGAGCGC CGCCCGGGCC GAGCTGCGCC GGCCGGGCAG CCGGGCCCAG 

151 GCCGCCTGGT ACGTCGACGT GCCGACCAAC AAGCTCGTCG TCGAGTCGGT 
CGGCGGACCA TGCAGCTGCA CGGCTGGTTG TTCGAGCAGC AGCTCAGCCA 






201 CGGCGACACC GCGGCGGCCG CCGACGCCGT CGCCGCCGCG GGCCTGCCTG 
GCCGCTGTGG CGCCGCCGGC GGCTGCGGCA GCGGCGGCGC CCGGACGGAC 

251 CCGACGCCGT GACGCTCGCG ACCACCGAGG CGCCACGGAC GTTCGTCGAC 
GGCTGCGGCA CTGCGAGCGC TGGTGGCTCC GCGGTGCCTG CAAGCAGCTG

301 GTCATCGGCG GCAACGCGTA CTACATCAAC GCGAGCAGCC GCTGCTCGGT
CAGTAGCCGC CGTTGCGCAT GATGTAGTTG CGCTCGTCGG CGACGAGCCA 

351 CGGCTTCGCG GTCGAGGGCG GGTTCGTCAC CGCGGGCCAC TGCGGGCGCG 
GCCGAAGCGC CAGCTCCCGC CCAAGCAGTG GCGCCCGGTG ACGCCCGCGC 

401 CGGGCGCGAG CACGTCGTCA CCGTCGGGGA CCTTCCGCGG CTCGTCGTTC 
GCCCGCGCTC GTGCAGCAGT GGCAGCCCCT GGAAGGCGCC GAGCAGCAAG

451 CCCGGCAACG ACTACGCGTG GGTCCAGGTC GCCTCGGGCA ACACGCCGCG 
GGGCCGTTGC TGATGCGCAC CCAGGTCCAG CGGAGCCCGT TGTGCGGCGC 

501 CGGGCTGGTG AACAACCACT CGGGCGGCAC GGTGCGCGTC ACCGGCTCGC

GCCCGACCAC TTGTTGGTGA GCCCGCCGTG CCACGCGCAG TGGCCGAGCG

551 AGCAGGCCGC GGTCGGCTCG TACGTGTGCC GATCGGGCAG CACGACGGGA 
TCGTCCGGCG CCAGCCGAGC ATGCACACGG CTAGCCCGTC GTGCTGCCCT 

601 TGGCGGTGCG GCTACGTCCG GGCGTACAAC ACGACCGTGC GGTACGCGGA 
ACCGCCACGC CGATGCAGGC CCGCATGTTG TGCTGGCACG CCATGCGCCT 

651 GGGCTCGGTC TCGGGCCTCA TCCGCACGAG CGTGTGCGCC GAGCCGGGCG 
CCCGAGCCAG AGCCCGGAGT AGGCGTGCTC GCACACGCGG CTCGGCCCGC

701 ACTCCGGCGG CTCGCTGGTC GCCGGCACGC AGGCCCAGGG CGTCACGTCG 
TGAGGCCGCC GAGCGACCAG CGGCCGTGCG TCCGGGTCCC GCAGTGCAGC 

751 GGCGGGTCCG GCAACTGCCG CTACGGGGGC ACGACGTACT TCCAGCCCGT

CCGCCCAGGC CGTTGACGGC GATGCCCCCG TGCTGCATGA AGGTCGGGCA 

801 GAACGAGATC CTGCAGGCCT ACGGTCTGCG TCTCGTCCTG GGCTGACACG 
CTTGCTCTAG GACGTCCGGA TGCCAGACGC AGAGCAGGAC CCGACTGTGC 

851 CTCGCGGCGG GCCCTCCCCT GCCCGTCGCG CGCCGGCCCC ACCAGCCCGG 
GAGCGCCGCC CGGGAGGGGA CGGGCAGCGC GCGGCCGGGG TGGTCGGGCC 

901 GCCG (SEQ ID NO: 63) 
CGGC



Cellulomonas cellasea (DSM 20118)

1 VGRVRQLPLR GHDVLPARER DPAGLRSASR PGLTRSRRAR LDAAGPSARV 

51 AAWYVDVPTN KLVVESVGDT AAAADAVAAA GLPADAVTLA TTEAPRTFVD

101 VIGGNAYYIN ASSRCSVGFA VEGGFVTAGH CGRAGASTSS PSGTFRGSSF 

151 PGNDYAWVQV ASGNTPRGLV NNHSGGTVRV TGSQQAAVGS YVCRSGSTTG 

201 WRCGYVRAYN TTVRYAEGSV SGLIRTSVCA EPGDSGGSLV AGTQAQGVTS 
251 GGSGNCRYGG TTYFQPVNEI LQAYGLRLVL G*HARGGPSP ARRAPAPPAR
301 A (SEQ ID NO:64) 




Cellulomonas xylanilytica (LMG 21723) 

1 CGCTGCTCGA TCGGGTTCGC CGTGACGGGC GGCTTCGTGA CCGCCGGCCA 
CTGCGGACGG TCCGGCGCGA CGACGACGTC GCCGAGCGGC ACGTTCGCCG 

GCGACGAGCT AGCCCAAGCG GCACTGCCCG CCGAAGCACT GGCGGCCGGT 
GACGCCTGCC AGGCCGCGCT GCTGCTGCAG CGGCTCGCCG TGCAAGCGGC 



101 GGTCCAGCTT TCCCGGCAAC GACTACGCCT GGGTCCGCGC GGCCTCGGGC 
AACACGCCGG TCGGTGCGGT GAACCGCTAC GACGGCAGCC GGGTGACCGT

CCAGGTCGAA AGGGCCGTTG CTGATGCGGA CCCAGGCGCG CCGGAGCCCG 
TTGTGCGGCC AGCCACGCCA CTTGGCGATG CTGCCGTCGG CCCACTGGCA 



201 GGCCGGGTCC ACCGACGCGG CCGTCGGTGC CGCGGTCTGC CGGTCGGGGT

CGACGACCGC GTGGGGCTGC GGCACGATCC AGTCCCGCGG CGCGAGCGTC 

CCGGCCCAGG TGGCTGCGCC GGCAGCCACG GCGCCAGACG GCCAGCCCCA 
GCTGCTGGCG CACCCCGACG CCGTGCTAGG TCAGGGCGCC GCGCTCGCAG 

301 ACGTACGCCC AGGGCACCGT CAGCGGGCTC ATCCGCACCA ACGTGTGCGC 
CGAGCCGGGT GACTCCGGGG GGTCGCTGAT CGCGGGCACC CAGGCGCGGG 



TGCATGCGGG TCCCGTGGCA GTCGCCCGAG TAGGCGTGGT TGCACACGCG 
GCTCGGCCCA CTGAGGCCCC CCAGCGACTA GCGCCCGTGG GTCCGCGCCC

401 GCGTGACGTC CGGCGGCTCC GGCAACTGC (SEQ ID NO: 65) 
CGCACTGCAG GCCGCCGAGG CCGTTGACG 



Cellulomonas xylanilytica (LMG 21723) 

1 RCSIGFAVTG GFVTAGHCGR SGATTTSPSG TFAGSSFPGN DYAWVRAASG

51 NTPVGAVNRY DGSRVTVAGS TDAAVGAAVC RSGSTTAWGC GTIQSRGASV

101 TYAQGTVSGL IRTNVCAEPG DSGGSLIAGT QARGVTSGGS GNC (SEQ ID NO: 66)



Oerskovia turbata (DSM 20577)

1 ATGGCACGAT CATTCTGGAG GACGCTCGCC ACGGCGTGCG CCGCGACGGC 
TACCGTGCTA GTAAGACCTC CTGCGAGCGG TGCCGCACGC GGCGCTGCCG

51 ACTGGTTGCC GGCCCCGCAG CGCTCACCGC GAACGCCGCG ACGCCCACCC 
TGACCAACGG CCGGGGCGTC GCGAGTGGCG CTTGCGGCGC TGCGGGTGGG

101 CCGACACCCC GACCGTTTCA CCCCAGACCT CCTCGAAGGT CTCGCCCGAG 
GGCTGTGGGG CTGGCAAAGT GGGGTCTGGA GGAGCTTCCA GAGCGGGCTC 
151 GTGCTCCGCG CCCTCCAGCG GGACCTGGGG CTGAGCGCCA AGGACGCGAC
CACGAGGCGC GGGAGGTCGC CCTGGACCCC GACTCGCGGT TCCTGCGCTG 

201 GAAGCGTCTG GCGTTCCAGT CCGACGCGGC GAGCACCGAG GACGCTCTCG

CTTCGCAGAC CGCAAGGTCA GGCTGCGCCG CTCGTGGCTC CTGCGAGAGC 

251 CCGACAGCCT GGACGCCTAC GCGGGCGCCT GGGTCGACCC TGCGAGGAAC 
GGCTGTCGGA CCTGCGGATG CGCCCGCGGA CCCAGCTGGG ACGCTCCTTG 

301 ACCCTGTACG TCGGCGTCGC CGACAGGGCC GAGGCCAAGG AGGTCCGTTC 
TGGGACATGC AGCCGCAGCG GCTGTCCCGG CTCCGGTTCC TCCAGGCAAG 

351 GGCCGGAGCG ACCCCCGTGG TCGTCGACCA CACGCTCGCC GAGCTCGACA 
CCGGCCTCGC TGGGGGCACC AGCAGCTGGT GTGCGAGCGG CTCGAGCTGT

401 CGTGGAAGGC GGCGCTCGAC GGTGAGCTCA ACGACCCCGC GGGCGTCCCG 
GCACCTTCCG CCGCGAGCTG CCACTCGAGT TGCTGGGGCG CCCGCAGGGC 

451 AGCTGGTTCG TCGACGTCAC GACCAACCAG GTCGTCGTCA ACGTGCACGA

TCGACCAAGC AGCTGCAGTG CTGGTTGGTC CAGCAGCAGT TGCACGTGCT 

501 CGGCGGACGC GCCCTCGCGG AGCTGGCTGC CGCGAGCGCG GGCGTGCCCG 
GCCGCCTGCG CGGGAGCGCC TCGACCGACG GCGCTCGCGC CCGCACGGGC 

551 CCGACGCCAT CACCTACGTG ACGACGACCG AGGCTCCTCG TCCCCTCGTC 
GGCTGCGGTA GTGGATGCAC TGCTGCTGGC TCCGAGGAGC AGGGGAGCAG 

601 GACGTGGTGG GCGGCAACGC GTACACCATG GGTTCGGGCG GGCGCTGCTC
CTGCACCACC CGCCGTTGCG CATGTGGTAC CCAAGCCCGC CCGCGACGAG

651 GGTCGGCTTC GCGGTGAACG GGGGCTTCAT CACGGCCGGG CACTGCGGCT 
CCAGCCGAAG CGCCACTTGC CCCCGAAGTA GTGCCGGCCC GTGACGCCGA 

701 CGGTCGGCAC CCGCACCTCG GGGCCGGGCG GCACGTTCCG GGGGTCGAAC

GCCAGCCGTG GGCGTGGAGC CCCGGCCCGC CGTGCAAGGC CCCCAGCTTG 

751 TTCCCCGGCA ACGACTACGC CTGGGTGCAG GTCGACGCGG GTAACACCCC 
AAGGGGCCGT TGCTGATGCG GACCCACGTC CAGCTGCGCC CATTGTGGGG 

801 GGTCGGCGCG GTCAACAACT ACAGCGGTGG GCGCGTCGCG GTCGCAGGGT 
CCAGCCGCGC CAGTTGTTGA TGTCGCCACC CGCGCAGCGC CAGCGTCCCA 

851 CGACGGCCGC GCCCGTGGGG GCCTCGGTCT GCCGGTCCGG TTCCACGACG 
GCTGCCGGCG CGGGCACCCC CGGAGCCAGA CGGCCAGGCC AAGGTGCTGC

901 GGCTGGCACT GCGGCACCAT CGGCGCGTAC AACACCTCGG TGACGTACCC 
CCGACCGTGA CGCCGTGGTA GCCGCGCATG TTGTGGAGCC ACTGCATGGG 

951 GCAGGGCACC GTCTCGGGGC TCATCCGCAC GAACGTGTGC GCCGAGCCCG

CGTCCCGTGG CAGAGCCCCG AGTAGGCGTG CTTGCACACG CGGCTCGGGC 



1001 GCGACTCGGG CGGCTCGCTC CTCGCGGGCA ACCAGGCGCA GGGCGTGACC
CGCTGAGCCC GCCGAGCGAG GAGCGCCCGT TGGTCCGCGT CCCGCACTGG






1051 TCGGGCGGGT CGGGCAACTG CTCGTCGGGC GGGACGACGT ACTTCCAGCC 
AGCCCGCCCA GCCCGTTGAC GAGCAGCCCG CCCTGCTGCA TGAAGGTCGG 

1101 CGTCAACGAG GCCCTCGGGG GGTACGGGCT CACGCTCGTG ACCTCTGACG 
GCAGTTGCTC CGGGAGCCCC CCATGCCCGA GTGCGAGCAC TGGAGACTGC 

1151 GTGGGGGCCC GAGCCGCCGC CGACCGGGTG CCAGGGCTAT GCGCGGACCT 
CACCCCCGGG CTCGGCGGCG GCTGGCCCAC GGTCCCGATA CGCGCCTGGA 

1201 ACCAGGGCAG CGTCTCGGCC GGGACGTCGG TCGCGCAGCG AACGGTTCGT 
TGGTCCCGTC GCAGAGCCGG CCCTGCAGCC AGCGCGTCGC TTGCCAAGCA 

1251 ACGTCACGAC CGGGGGCGGG CGACCGGGTG TGCC (SEQ ID NO: 67) 
TGCAGTGCTG GCCCCCGCCC GCTGGCCCAC ACGG 



Oerskovia turbata (DSM 20577) 

1 MARSFWRTLA TACAATALVA GPAALTANAA TPTPDTPTVS PQTSSKVSPE

51 VLRALQRDLG LSAKDATKRL AFQSDAASTE DALADSLDAY AGAWVDPARN

101 TLYVGVADRA EAKEVRSAGA TPVVVDHTLA ELDTWKAALD GELNDPAGVP

151 SWFVDVTTNQ VVVNVHDGGR ALAELAAASA GVPADAITYV TTTEAPRPLV

201 DVVGGNAYTM GSGGRCSVGF AVNGGFITAG HCGSVGTRTS GPGGTFRGSN

251 FPGNDYAWVQ VDAGNTPVGA VNNYSGGRVA VAGSTAAPVG ASVCRSGSTT

301 GWHCGTIGAY NTSVTYPQGT VSGLIRTNVC AEPGDSGGSL LAGNQAQGVT

351 SGGSGNCSSG GTTYFQPVNE ALGGYGLTLV TSDGGGPSRR RPGARAMRGP

401 TRAASRPGRR SRSERFVRHD RGRATGCA (SEQ ID NO: 68)



Oerskovia jenensis (DSM 46000) 

1 GCCGCTGCTC GGTCGGCTTC GCGGTGAACG GCGGCTTCGT CACCGCAGGC 

CGGCGACGAG CCAGCCGAAG CGCCACTTGC CGCCGAAGCA GTGGCGTCCG 

51 CACTGCGGGA CGGTGGGCAC CCGCACCTCG GGGCCGGGCG GCACGTTCCG 
GTGACGCCCT GCCACCCGTG GGCGTGGAGC CCCGGCCCGC CGTGCAAGGC 

101 CGGGTCGAGC TTCCCCGGCA ACGACTACGC CTGGGTGCAG GTCGACGCGG 
GCCCAGCTCG AAGGGGCCGT TGCTGATGCG GACCCACGTC CAGCTGCGCC 

151 GGAACACCCC GGTCGGGGCC GTCAACAACT ACAGCGGTGG ACGCGTCGCG 
CCTTGTGGGG CCAGCCCCGG CAGTTGTTGA TGTCGCCACC TGCGCAGCGC 

201 GTCGCGGGCT CGACGGCCGC ACCCGTGGGT TCCTCGGTCT GCCGGTCCGG 
CAGCGCCCGA GCTGCCGGCG TGGGCACCCA AGGAGCCAGA CGGCCAGGCC 

251 TTCCACGACG GGCTGGCGCT GCGGCACGAT CGCGGCCTAC AACAGCTCGG 
AAGGTGCTGC CCGACCGCGA CGCCGTGCTA GCGCCGGATG TTGTCGAGCC 

301 TGACGTACCC GCAGGGGACC GTCTCCGGGC TCATCCGCAC CAACGTGTGC 
ACTGCATGGG CGTCCCCTGG CAGAGGCCCG AGTAGGCGTG GTTGCACACG 

351 GCCGAGCCGG GCGACTCGGG CGGCTCGCTC CTCGCGGGCA ACCAGGCACA 
CGGCTCGGCC CGCTGAGCCC GCCGAGCGAG GAGCGCCCGT TGGTCCGTGT

401 GGGCCTGACG TCGGGCGGGT CGGGCAACTG CTCGTCGGGC GGCACGACGT
CCCGGACTGC AGCCCGCCCA GCCCGTTGAC GAGCAGCCCG CCGTGCTGCA

451 ACTTCCAGCC CGTCAACGAG GCGCTCTCGG CCTACGGCCT CACGCTCGTG
TGAAGGTCGG GCAGTTGCTC CGCGAGAGCC GGATGCCGGA GTGCGAGCAC

501 ACCTCCGGCG GCAGGGGCAA CTGC (SEQ ID NO: 69)
TGGAGGCCGC CGTCCCCGTT GACG



Oerskovia jenensis (DSM 46000)

1 RCSVGFAVNG GFVTAGHCGT VGTRTSGPGG TFRGSSFPGN DYAWVQVDAG

51 NTPVGAVNNY SGGRVAVAGS TAAPVGSSVC RSGSTTGWRC GTIAAYNSSV

101 TYPQGTVSGL IRTNVCAEPG DSGGSLLAGN QAQGLTSGGS GNCSSGGTTY

151 FQPVNEALSA YGLTLVTSGG RGNC (SEQ ID NO: 70)



Cellulosimicrobium cellulans (DSM 20424) 

1 CCACGGGCGG CGGGTCGGGC AGCGCGCTCG TCGGGCTCGC GGGCAAGTGC 
GGTGCCCGCC GCCCAGCCCG TCGCGCGAGC AGCCCGAGCG CCCGTTCACG 

51 ATCGACGTCC CCGGGTCCGA CTTCAGTGAC GGCAAGCGCC TCCAGCTGTG 
TAGCTGCAGG GGCCCAGGCT GAAGTCACTG CCGTTCGCGG AGGTCGACAC 

101 GACGTGCAAC GGGTCGCAGG CAGCGCTGGA CGTTCGAAGC CGACGGCACC 
CTGCACGTTG CCCAGCGTCC GTCGCGACCT GCAAGCTTCG GCTGCCGTGG 

151 GTACGCGCGG GCGGCAAGTG CATGGACGTC GCGTGGGCGC CGCGGCCGAC
CATGCGCGCC CGCCGTTCAC GTACCTGCAG CGCACCCGCG GCGCCGGCTG 

201 GGCACGGCGC TCCAGCTCGC GAACTGCACG GCAACGCGGC CCAGAAGTTC 
CCGTGCCGCG AGGTCGAGCG CTTGACGTGC CGTTGCGCCG GGTCTTCAAG 

251 GTGCTCAACG GCGCGGGCGA CCTCGTGTCG GTGCTGGCGA ACAAAGTGCG 
CACGAGTTGC CGCGCCCGCT GGAGCACAGC CACGACCGCT TGTTTCACGC 

301 TCGACGCCGC CGGGTGCGCA CCGAGGTACT CGCGGCGCCG TACGAGCTCA 
AGCTGCGGCG GCCCACGCGT GGCTCCATGA GCGCCGCGGC ATGCTCGAGT 

351 CGGCGACGTG CGCGGCGGCG ACCGCTACAT CACACGGGAC CCGGGCGCGT 
GCCGCTGCAC GCGCCGCGGC TGGCGATGTA GTGTGCCCTG GGCCCGCGCA 

401 CGTCGGGCTC GGCCTGCTCG ATCGGGTACG CCGTCCAGGG CGGCTTCGTC 
GCAGCCCGAG CCGGACGAGC TAGCCCATGC GGCAGGTCCC GCCGAAGCAG 

451 ACGGCGGGGC ACTGCGGACG CGGCGGGACA AGGAGAGTGC TCACCGCGAG 
TGCCGCCCCG TGACGCCTGC GCCGCCCTGT TCCTCTCACG AGTGGCGCTC 

501 CTGGGCGCGC ATGGGGACGG TCCAGGCGGC GTCGTTCCCC GGCCACGACT 
GACCCGCGCG TACCCCTGCC AGGTCCGCCG CAGCAAGGGG CCGGTGCTGA

551 ACGCGTGGGT GCGCGTCGAC GCCGGGTTCT CCCCCGTCCC GCGGGTGAAC 
TGCGCACCCA CGCGCAGCTG CGGCCCAAGA GGGGGCAGGG CGCCCACTTG 

601 AACTACGCCG GCGGCACCGT CGACGTCGCC GGCTCGGCCG AGGCGCCCGT 
TTGATGCGGC CGCCGTGGCA GCTGCAGCGG CCGAGCCGGC TCCGCGGGCA 

651 GGGTGCGTCG GTGTGCCGCT CGGGCGCCAC GACCGGCTGG CGCTGCGGCG 
CCCACGCAGC CACACGGCGA GCCCGCGGTG CTGGCCGACC GCGACGCCGC 

701 TCATCGAGCA GAAGAACATC ACCGTCAACT ACGGCAACGG CGACGTTCCC 
AGTAGCTCGT CTTCTTGTAG TGGCAGTTGA TGCCGTTGCC GCTGCAAGGG 

751 GGCCTCGTGC GCGGCAGCGC GTGCGCGGAG GGCGGCGACT CGGGCGGGTC 
CCGGAGCACG CGCCGTCGCG CACGCGCCTC CCGCCGCTGA GCCCGCCCAG 

801 GGTGATCTCC GGCAACCAGG CGCAGGGCGT CACGTCGGGC AGGATCAACG 
CCACTAGAGG CCGTTGGTCC GCGTCCCGCA GTGCAGCCCG TCCTAGTTGC 

851 ACTGCTCGAA CGGCGGCAAG TTCCTCTACC AGCCCGATCG ACGGCCTGTC 
TGACGAGCTT GCCGCCGTTC AAGGAGATGG TCGGGCTAGC TGCCGGACAG 

901 GCTCGTGACC ACGGGCGGCG GGTCGGGCAG CGCGCTCGTC GGGCTCGCGG
CGAGCACTGG TGCCCGCCGC CCAGCCCGTC GCGCGAGCAG CCCGAGCGCC 

951 GCAAGTGCAT CGACGTCCCC GGGTCCGACT TCAG (SEQ ID NO: 71) 
CGTTCACGTA GCTGCAGGGG CCCAGGCTGA AGTC 



Cellulosimicrobium cellulans (DSM 20424) 

1 PRAAGRAARS SGSRASASTS PGPTSVTASA SSCGRATGRR QRWTFEADGT 

51 VRAGGKCMDV AWAPRPTARR SSSRTARQRG PEVRAQRRGR PRVGAGEQSA 

101 STPPGAHRGT RGAVRAHGDV RGGDRYITRD PGASSGSACS IGYAVQGGFV 

151 TAGHCGRGGT RRVLTASWAR MGTVQAASFP GHDYAWVRVD AGFSPVPRVN

201 NYAGGTVDVA GSAEAPVGAS VCRSGATTGW RCGVIEQKNI TVNYGNGDVP

251 GLVRGSACAE GGDSGGSVIS GNQAQGVTSG RINDCSNGGK FLYQPDRRPV

301 ARDHGRRVGQ RARRARGQVH RRPRVRLQ (SEQ ID NO: 72) 



Promicromonospora citrea (DSM 43110)

1 TTCCCCGGCA ACGACTACGC GTGGGTGAAC ACGGGCACGG ACGACACCCT 
AAGGGGCCGT TGCTGATGCG CACCCACTTG TGCCCGTGCC TGCTGTGGGA 

51 CGTCGGCGCC GTGAACAACT ACAGCGGCGG CACGGTCAAC GTCGCGGGCT 
GCAGCCGCGG CACTTGTTGA TGTCGCCGCC GTGCCAGTTG CAGCGCCCGA 

101 CGACCCGTGC CGCCGTCGGC GCGACGGTCT GCCGCTCGGG CTCCACGACC 
GCTGGGCACG GCGGCAGCCG CGCTGCCAGA CGGCGAGCCC GAGGTGCTGG 

151 GGCTGGCACT GCGGCACCAT CCAGGCGCTG AACGCGTCGG TCACCTACGC 
CCGACCGTGA CGCCGTGGTA GGTCCGCGAC TTGCGCAGCC AGTGGATGCG



201 CGAGGGCACC GTGAGCGGCC TCATCCGCAC CAACGTGTGC GCCGAGCCCG 
GCTCCCGTGG CACTCGCCGG AGTAGGCGTG GTTGCACACG CGGCTCGGGC 

251 GCGACTC (SEQ ID NO: 73) 
CGCTGAG 



Promicromonospora citrea (DSM 43110)

1 FPGNDYAWVN TGTDDTLVGA VNNYSGGTVN VAGSTRAAVG ATVCRSGSTT 
51 GWHCGTIQAL NASVTYAEGT VSGLIRTNVC AEPGD (SEQ ID NO: 74) 
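
The sequence listings above pair each top strand with its complementary strand and follow each polynucleotide with its translated amino acid sequence. The relationship between these lines can be sketched as follows, using the first 30 nucleotides of SEQ ID NO: 73 and the standard genetic code. This is an illustrative check only; the helper names (`clean`, `complement`, `translate`) are assumptions and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the spacing used in the listing and upper-case the sequence."""
    return "".join(seq.split()).upper()


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order
    as the second line of each pair in the listing."""
    return clean(seq).translate(COMPLEMENT)


def translate(seq: str) -> str:
    """Translate in reading frame 1; a trailing partial codon is ignored."""
    dna = clean(seq)
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 nt of SEQ ID NO: 73 (top strand) and the line printed beneath it.
top = "TTCCCCGGCA ACGACTACGC GTGGGTGAAC"
bottom = "AAGGGGCCGT TGCTGATGCG CACCCACTTG"

print(complement(top) == clean(bottom))  # True
print(translate(top))                    # FPGNDYAWVN
```

The translated string matches the first ten residues of SEQ ID NO: 74, which is the kind of cross-check that can be used to spot transcription errors in either listing.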



Promicromonospora sukumoe (DSM 44121) 

1 TTCCCCGGCA ACGACTACGC GTGGGTGAAC GTCGGCTCCG ACGACACCCC 
AAGGGGCCGT TGCTGATGCG CACCCACTTG CAGCCGAGGC TGCTGTGGGG 

51 GATCGGTGCG GTCAACAACT ACAGCGGCGG CACCGTGAAC GTCGCGGGCT 
CTAGCCACGC CAGTTGTTGA TGTCGCCGCC GTGGCACTTG CAGCGCCCGA 



101 CGACCCAGGC CGCCGTCGGC TCCACCGTCT GCCGCTCCGG TTCCACGACC 
GCTGGGTCCG GCGGCAGCCG AGGTGGCAGA CGGCGAGGCC AAGGTGCTGG

151 GGCTGGCACT GCGGCACCAT CCAGGCCTTC AACGCGTCGG TCACCTACGC 
CCGACCGTGA CGCCGTGGTA GGTCCGGAAG TTGCGCAGCC AGTGGATGCG 

201 CGAGGGCACC GTGTCCGGCC TGATCCGCAC CAACGTCTGC GCCGAGCCCG

GCTCCCGTGG CACAGGCCGG ACTAGGCGTG GTTGCAGACG CGGCTCGGGC 

251 GCGACTC (SEQ ID NO: 75) 
CGCTGAG 



Promicromonospora sukumoe (DSM 44121) 

1 FPGNDYAWVN VGSDDTPIGA VNNYSGGTVN VAGSTQAAVG STVCRSGSTT
51 GWHCGTIQAF NASVTYAEGT VSGLIRTNVC AEPGD (SEQ ID NO: 76) 



Xylanibacterium ulmi (LMG 21721) 

1 GCCGCTGCTC GATCGGGTTC GCCGTGACGG GCGGCTTCGT GACCGCCGGC 
CGGCGACGAG CTAGCCCAAG CGGCACTGCC CGCCGAAGCA CTGGCGGCCG



51 CACTGCGGAC GGTCCGGCGC GACGACGACG TCCGCGAGCG GCACGTTCGC 
GTGACGCCTG CCAGGCCGCG CTGCTGCTGC AGGCGCTCGC CGTGCAAGCG 



101 CGGGTCCAGC TTTCCCGGCA ACGACTACGC CTGGGTCCGC GCGGCCTCGG

GCCCAGGTCG AAAGGGCCGT TGCTGATGCG GACCCAGGCG CGCCGGAGCC 
151 GAACACGCCG GTCGGTGCGG TGAACCGCTA CGACGGCAGC CGGGTGACCG
CTTGTGCGGC CAGCCACGCC ACTTGGCGAT GCTGCCGTCG GCCCACTGGC 

201 TGGCCGGGTC CACCGACGCG GCCGTCGGTG CCGCGGTCTG CCGGTCGGGG 
ACCGGCCCAG GTGGCTGCGC CGGCAGCCAC GGCGCCAGAC GGCCAGCCCC 

251 TCGACGACCG CGTGGCGCTG CGGCACGATC CAGTCCCGCG GCGCGACGGT 
AGCTGCTGGC GCACCGCGAC GCCGTGCTAG GTCAGGGCGC CGCGCTGCCA 

301 CACGTACGCC GAGGGCACCG TCAGCGGGCT CATCCGCACC AACGTGTGCG 
GTGCATGCGG GTCCCGTGGC AGTCGCCCGA GTAGGCGTGG TTGCACACGC 

351 CCGAGCCGGG TGACTCCGGG GGGTCGCTGA TCGCGGGCAC CCAGGCGCAG 
GGCTCGGCCC ACTGAGGCCC CCCAGCGACT AGCGCCCGTG GGTCCGCGTC 

401 GGCGTGACGT CCGGCGGCTC CGGCAACTGC (SEQ ID NO: 77)
CCGCACTGCA GGCCGCCGAG GCCGTTGACG 



Xylanibacterium ulmi (LMG 21721)

1 RCSIGFAVTG GFVTAGHCGR SGATTTSASG TFAGSSFPGN DYAWVRAASG
51 NTPVGAVNRY DGSRVTVAGS TDAAVGAAVC RSGSTTAWRC GTIQSRGATV
101 TYAQGTVSGL IRTNVCAEPG DSGGSLIAGT QAQGVTSGGS G (SEQ ID NO: 78)



Inverse PCR 

Inverse PCR was used to determine the full-length serine protease genes from 
chromosomal DNA of bacterial strains of the suborder Micrococcineae shown by PCR or 
immunoblotting to be novel homologues of the new Cellulomonas sp. 69B4 protease 
described herein. 

Digested DNA was purified using the PCR purification kit (Qiagen, Catalogue # 28106)
and self-ligated with T4 DNA ligase (Invitrogen) according to the manufacturers'
instructions. Ligation mixtures were purified with the PCR purification kit (Qiagen), and
PCR was performed with primers selected from the following list:



RV-1 Rest         5' - ACCCACGCGTAGTCGTTGCC - 3' (SEQ ID NO:79)
RV-1 Cellul       5' - ACCCACGCGTAGTCGTKGCCGGGG - 3' (SEQ ID NO:80)
RV-2 biaz-fimi    5' - TCGTCGTGGTCGCGCCGG - 3' (SEQ ID NO:81)
RV-2 cella-flavi  5' - CGACGTGCTCGCGCCCG - 3' (SEQ ID NO:82)
RV-2 cellul       5' - CGCGCCCAGCTCGCGGTG - 3' (SEQ ID NO:83)
RV-2 turb         5' - CGGCCCCGAGGTGCGGGTGCCG - 3' (SEQ ID NO:84)
Fw-1 biaz-fimi    5' - CAGCGTCTCCGGCCTCATCCGC - 3' (SEQ ID NO:85)
Fw-1 cella-flavi  5' - CTCGGTCTCGGGCCTCATCCGC - 3' (SEQ ID NO:86)
Fw-1 cellul       5' - CGACGTTCCCGGCCTCGTGCGC - 3' (SEQ ID NO:87)
Fw-1 turb         5' - CACCGTCTCGGGGCTCATCCGC - 3' (SEQ ID NO:88)
Fw-2 rest         5' - AGCARCGTGTGCGCCGAGCC - 3' (SEQ ID NO:89)
Fw-2 cellul       5' - GGCAGCGCGTGCGCGGAGGG - 3' (SEQ ID NO:90)
Fw-1 gelida       5' - GCCGCTGCTCGATCGGGTTC - 3' (SEQ ID NO:91)
Rv-1 gelida       5' - GCAGTTGCCGGAGCCGCCGGACGT - 3' (SEQ ID NO:92)

The amplified PCR products were examined by agarose gel electrophoresis (0.8%
agarose in TBE buffer (Invitrogen)). Distinct bands in the range 1.3 - 2.2 kbp for each
organism were excised from the gel and purified using the Qiagen gel extraction kit, and the
sequence was analyzed by BaseClear. Sequence analysis revealed that these DNA fragments
covered some additional parts of protease genes homologous to the Cellulomonas 69B4
protease gene.



Genome Walking Using Rapid Amplification of Genomic Ends (RAGE) 

A genome walking methodology (RAGE) known in the art was used to determine the
full-length serine protease genes from chromosomal DNA of bacterial strains of the
suborder Micrococcineae shown by PCR or immunoblotting to be novel homologues of the
new Cellulomonas sp. 69B4 protease. RAGE was performed using the Universal
GenomeWalker™ Kit (BD Biosciences Clontech), with some modifications to the
manufacturer's protocol (BD Biosciences user manual PT3042-1, Version # PR03300).
Modifications to the manufacturer's protocol included addition of DMSO (3 µL) to the
reaction mixture in 50 µL total volume due to the high GC content of the template DNA and
use of Advantage™ - GC Genomic Polymerase Mix (BD Biosciences Clontech) for the PCR
reactions, which were performed as follows:

                                      PCR 1        PCR 2

99°C - 0.05 sec

94°C - 0.25 sec/72°C - 3.00 min       7 cycles     4 cycles
94°C - 0.25 sec/67°C - 4.00 min       39 cycles    24 cycles
67°C - 7.00 min
15°C - 1.00 min
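
The protocol modification above is motivated by the high GC content of the template DNA. As a rough illustration, GC content and the final DMSO concentration can be computed as sketched below. The function names are assumptions introduced for this example, and the primer RV-1 Rest (SEQ ID NO:79) is used only as a convenient short sequence from this document, not as a measure of template GC content.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a sequence (spaces are ignored)."""
    s = "".join(seq.split()).upper()
    if not s:
        raise ValueError("empty sequence")
    gc = sum(s.count(base) for base in "GC")
    return 100.0 * gc / len(s)


def percent_v_v(additive_ul: float, total_ul: float) -> float:
    """Volume percentage of an additive in the final reaction volume."""
    return 100.0 * additive_ul / total_ul


# RV-1 Rest primer (SEQ ID NO:79): 13 of its 20 bases are G or C.
print(gc_content("ACCCACGCGTAGTCGTTGCC"))  # 65.0

# 3 uL of DMSO in a 50 uL reaction, as stated in the text.
print(percent_v_v(3, 50))                  # 6.0
```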

PCR was performed with primers (Invitrogen, Paisley, UK) selected from the following list
(listed in 5' to 3' orientation):

RV-1 Rest          ACCCACGCGTAGTCGTTGCC (SEQ ID NO:79)
RV-1 Cellul        ACCCACGCGTAGTCGTKGCCGGGG (SEQ ID NO:80)
RV-2 biaz-fimi     TCGTCGTGGTCGCGCCGG (SEQ ID NO:81)
RV-2 cella-flavi   CGACGTGCTCGCGCCCG (SEQ ID NO:82)
RV-2 cellul        CGCGCCCAGCTCGCGGTG (SEQ ID NO:83)
RV-2 turb          CGGCCCCGAGGTGCGGGTGCCG (SEQ ID NO:84)
Fw-1 biaz-fimi     CAGCGTCTCCGGCCTCATCCGC (SEQ ID NO:85)
Fw-1 cella-flavi   CTCGGTCTCGGGCCTCATCCGC (SEQ ID NO:86)
Fw-1 cellul        CGACGTTCCCGGCCTCGTGCGC (SEQ ID NO:87)
Fw-1 turb          CACCGTCTCGGGGCTCATCCGC (SEQ ID NO:88)
Fw-2 rest          AGCARCGTGTGCGCCGAGCC (SEQ ID NO:89)
Fw-2 cellul        GGCAGCGCGTGCGCGGAGGG (SEQ ID NO:90)
Fw-1 gelida        GCCGCTGCTCGATCGGGTTC (SEQ ID NO:91)
Rv-1 gelida        GCAGTTGCCGGAGCCGCCGGACGT (SEQ ID NO:92)
Flavi FW1          TGCGCCGAGCCCGGCGACTCCGGC (SEQ ID NO:93)
Flavi FW2          GGCACGACGTACTTCCAGCCCGTGAAC (SEQ ID NO:94)
Flavi RV1          GACCCACGCGTAGTCGTTGCCGGGGAACGACGA (SEQ ID NO:95)
Flavi RV2          GAAGGTCCCCGACGGTGACGACGTGCTCGCGCC (SEQ ID NO:96)
Turb FW1           CAGGCGCAGGGCGTGACCTCGGGCGGGTCG (SEQ ID NO:97)
Turb FW2           GGCGGGACGACGTACTTCCAGCCCGTCAA (SEQ ID NO:98)
Cellu RV1          CACCCACGCGTAGTCGTGGCCGGGGAACGA (SEQ ID NO:99)
Cellu RV2          GAAGCCGCCCTGGACGGCGTACCCGATCGAGCA (SEQ ID NO:100)
Cellu FW1          TGCGCGGAGGGCGGCGACTCGGGCGGGTCG (SEQ ID NO:101)
Cellu FW2          TTCCTCTACCAGCCCGTCAACCCGATCCTA (SEQ ID NO:102)
Cella RV2          CGCCGCGGGGACGAACCCGCCCTCGACCGCGAA (SEQ ID NO:103)
Cella RV1          CGCGTAGTCGTTGCCGGGGAACGACGAGCC (SEQ ID NO:104)
Cella FW1          GGCCTCATCCGCACGAGGGTGTGCGCCGAG (SEQ ID NO:105)
Cella FW2          ACGTCGGGCGGGTCCGGCAACTGCCGCTACGGGGGC (SEQ ID NO:106)
Gelida RV1         GAGCCCGTACACCCGGAGGGCCTCGTTGACGGGCTGGAA (SEQ ID NO:107)
Gelida RV2         CGTCACGCCCTGCGCCTGGTTGCCCGCGAG (SEQ ID NO:108)
Gelida FW1         TCCAGCCCGTCAACGAGGCCCTCCGGGTGTACGGGCTC (SEQ ID NO:109)
Gelida FW2         ACGTCGGTCGCGCAGCCGAACGGTTCGTACGTC (SEQ ID NO:110)
Biazot RV1         CGTGGTCGCGCCGGTCGTGCCGCAGTGCCC (SEQ ID NO:111)
Biazot RV2         GACGACGACCGTGTTGGTAGTGACGTCGACGTACCA (SEQ ID NO:112)
Biazot FW1         TCCACCACGGGGTGGCGCTGCGGGACGATC (SEQ ID NO:113)
Biazot FW2         GTGTGCGCCGAGCCCGGCGACTCCGGCGGC (SEQ ID NO:114)
Turb RV C-mature   GCTCGGGCCCCCACCGTCAGAGGTCACGAGCGTGAG (SEQ ID NO:115)
Turb FW signal     ATGGCACGATCATTCTGGAGGACGCTCGCCACGGCG (SEQ ID NO:116)
Cellu internal FW  TGCTCGATCGGGTACGCCGTCCAGGGCGGCTTC (SEQ ID NO:117)
Cellu internal RV  TAGGATCGGGTTGACGGGCTGGTAGAGGAA (SEQ ID NO:118)
Biazot Int Fw      TGGTACGTCGACGTCACTACCAACACGGTCGTCGTC (SEQ ID NO:119)
Biazot Int Rv      GCCGCCGGAGTCGCCGGGCTCGGCGCACAC (SEQ ID NO:120)
flavi Nterm        GTSGACGTSATCGGSGGSAACGCSTACTAC (SEQ ID NO:121)
flavi Cterm        SGCSGTSGCSGGNGANGA (SEQ ID NO:122)
fimi Nterm         GTSGAYGTSATCGGCGGCGAYGCSTAC (SEQ ID NO:123)
fimi Cterm         SGASGCGTANCCCTGNCC (SEQ ID NO:124)
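
Several primers in the list above contain degenerate positions written with IUPAC ambiguity codes (for example R, K, S, Y and N), so each such primer stands for a pool of concrete oligonucleotides. A minimal sketch of how the pool size and its members can be enumerated is given below; the function names `degeneracy` and `expand` are assumptions made for this illustration.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Fw-2 rest (SEQ ID NO:89) has a single R (A or G) position.
print(expand("AGCARCGTGTGCGCCGAGCC"))
# ['AGCAACGTGTGCGCCGAGCC', 'AGCAGCGTGTGCGCCGAGCC']

# flavi Cterm (SEQ ID NO:122) has four S and two N positions.
print(degeneracy("SGCSGTSGCSGGNGANGA"))  # 256
```

Enumerating the full pool is only practical for primers of low degeneracy; for highly degenerate primers, `degeneracy` alone gives the pool size without building the list.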

The PCR products were subcloned in the pCR4-TOPO TA cloning vector (Invitrogen)
and transformed to E. coli Top10 one-shot electrocompetent cells (Invitrogen). The
transformants were incubated (37°C, 260 rpm, 16 hours) in 2xTY medium with 100 µg/ml
ampicillin. The isolated plasmid DNA (isolated using the Qiagen Qiaprep pDNA isolation kit)
was sequenced by BaseClear.

Sequence Analysis 

Full length polynucleotide sequences were assembled from PCR fragment 
sequences using the ContigExpress and AlignX programs in Vector NTI suite v. 9.0.0 
(Invitrogen) using the original polynucleotide sequence obtained in Example 4 as template 
and the ASP mature protease and ASP full-length sequence for alignment. The results for 
the polynucleotide sequences are displayed in Table 7-1 and the translated amino acid 
sequences are displayed in Table 7-2. For each of the natural bacterial strains the 
polynucleotide sequences and translated amino acid sequences for each of the homologous 
proteases are provided above. 
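
For orientation, percent identity between two aligned sequences can be computed as sketched below. This is an assumption-laden illustration: it simply counts identical positions in two equal-length, pre-aligned strings and skips gap columns, and it is not a reproduction of the AlignX scoring used for the tables that follow. The example compares two homologue fragments listed above (SEQ ID NO: 74 and SEQ ID NO: 76) with each other rather than with ASP.

```python
def clean(seq: str) -> str:
    """Remove listing spaces and upper-case the sequence."""
    return "".join(seq.split()).upper()


def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are not counted.
    """
    a, b = clean(a), clean(b)
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Promicromonospora citrea fragment (SEQ ID NO: 74)
CITREA = """FPGNDYAWVN TGTDDTLVGA VNNYSGGTVN VAGSTRAAVG ATVCRSGSTT
GWHCGTIQAL NASVTYAEGT VSGLIRTNVC AEPGD"""

# Promicromonospora sukumoe fragment (SEQ ID NO: 76)
SUKUMOE = """FPGNDYAWVN VGSDDTPIGA VNNYSGGTVN VAGSTQAAVG STVCRSGSTT
GWHCGTIQAF NASVTYAEGT VSGLIRTNVC AEPGD"""

# The two 85-residue fragments differ at 7 positions.
print(round(percent_identity(CITREA, SUKUMOE), 1))  # 91.8
```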

Table 7-1 provides comparison information between ASP protease and various other
sequences obtained from other bacterial strains. Amino acid sequence information for
Asp-mature-protease homologues is available from 13 species:

1. Cellulomonas biazotea DSM 20112
2. Cellulomonas flavigena DSM 20109
3. Cellulomonas fimi DSM 20113
4. Cellulomonas cellasea DSM 20118
5. Cellulomonas gelida DSM 20111
6. Cellulomonas iranensis DSM 14784
7. Cellulomonas xylanilytica LMG 21723
8. Oerskovia jenensis DSM 46000
9. Oerskovia turbata DSM 20577
10. Cellulosimicrobium cellulans DSM 20424
11. Promicromonospora citrea DSM 43110
12. Promicromonospora sukumoe DSM 44121
13. Xylanibacterium ulmi LMG 21721

Notably, the sequence from Cellulomonas gelida, at 48 amino acids, is too short for
useful consensus alignment. Sequence alignments against Asp-mature for the remaining 12
species are provided herein. To date, the complete mature sequence has been determined for
Oerskovia turbata, Cellulomonas cellasea, Cellulomonas biazotea and Cellulosimicrobium
cellulans. However, there are some problems, and sequence fidelity is not guaranteed for
the sequence information known to the public. Cellulomonas cellasea protease is clearly
homologous to Asp (61.4% identity). However, sequencing of 10 independent PCR
fragments of the C-terminal region gave a stop codon at position 184 in every case,
suggesting that there is no C-terminal prosequence. In addition, Cellulosimicrobium
cellulans is a close relative of Cellulomonas and clearly has an Asp-homologous protease.
However, the sequence identity is low, only 47.7%. It contains an insertion of 4 amino acids
at position 43-44, and it is uncertain where the N-terminus of the protein begins.
Nonetheless, the data provided here clearly show that there are enzymes homologous to the
ASP protease described herein. Thus, it is intended that the present invention encompass the
ASP protease isolated from Cellulomonas strain 69B4, as well as other homologous proteases.

In this Table, the nucleotide numbering is based on the full-length gene of 69B4
protease (SEQ ID NO:2), where nt 1 - 84 encode the signal peptide, nt 85 - 594 encode the
N-terminal prosequence, nt 595 - 1161 encode the mature 69B4 protease, and nt 1162 -
1485 encode the C-terminal prosequence.
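
The nucleotide numbering above maps directly onto the amino acid positions used for full-length ASP in Table 7-2. A small sketch of that conversion follows; the names `REGIONS`, `region_of` and `aa_span` are assumptions introduced for this illustration, and the calculation assumes the reading frame starts at nt 1.

```python
# Regions of the full-length 69B4 protease gene (1-based, inclusive),
# as given in the text for SEQ ID NO:2.
REGIONS = [
    ("signal peptide", 1, 84),
    ("N-terminal prosequence", 85, 594),
    ("mature protease", 595, 1161),
    ("C-terminal prosequence", 1162, 1485),
]


def region_of(nt: int) -> str:
    """Name of the gene region containing a nucleotide position."""
    for name, start, end in REGIONS:
        if start <= nt <= end:
            return name
    raise ValueError(f"nt {nt} lies outside the 1485 bp gene")


def aa_span(nt_start: int, nt_end: int) -> tuple[int, int]:
    """Amino acid positions covered by a nucleotide range.

    The first residue is the one whose codon contains nt_start; the last
    residue is the last codon completed by nt_end.
    """
    return ((nt_start - 1) // 3 + 1, nt_end // 3)


for name, start, end in REGIONS:
    first, last = aa_span(start, end)
    print(f"{name}: nt {start}-{end} -> aa {first}-{last} ({last - first + 1} residues)")

# A homologue fragment overlapping nt 748-1004 lies in the mature region
# and corresponds to residues 250-334 of full-length ASP.
print(region_of(748), aa_span(748, 1004))
```

The four spans come out as 28, 170, 189 and 108 residues, which agrees with the 69B4 (ASP) row of Table 7-2.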



Table 7-1. Percent Identity of Homologous Polynucleotide Sequences from
Natural Isolate Strains Compared with ASP Mature Protease Gene Sequence

Strain                                   Total Base Pairs   Overlap*     % Identity Overlap Mature Protease
69B4 (ASP) Protease                      1485               1-1485
Cellulomonas flavigena DSM 20109         555                595-1156     72.3
Cellulomonas biazotea DSM 20112          627                332-1355     73.7
Cellulomonas fimi DSM 20113              474                595-1068     78.7
Cellulomonas gelida DSM 20118            462                1018-1485    72.2
Cellulomonas iranensis DSM 14784         257                748-1004     75.2
Cellulomonas cellasea DSM 20118          904                294-1201     72.7
Cellulomonas xylanilytica LMG 21723      429                640-1068     75.1
Oerskovia turbata DSM 20577              1284               1-1291       73.1
Oerskovia jenensis DSM 46000             387                638-1158     72.7
Cellulosimicrobium cellulans DSM 20424   984                251-1199     63.1
Promicromonospora citrea DSM 43110       257                748-1004     75.9
Promicromonospora sukumoe DSM 44121      257                748-1004     77.4
Xylanibacterium ulmi LMG 21721           430                638-1068     77.0



The following Table (Table 7-2) provides information regarding the translated amino 
acid sequence data in natural isolate strains compared with full-length ASP. 



Table 7-2. Translated Amino Acid Sequence Data Comparisons

Strain | Total amino acids | Signal peptide overlap: position | N-terminal pro overlap: position | Mature protease overlap: position | C-terminal pro overlap: position
69B4 (ASP) Protease | 495 | 28 (1-28) | 170 (29-198) | 189 (199-387) | 108 (388-495)
Cellulomonas flavigena DSM 20109 | 185 | | | 185 (199-383); id 68.6% |
Cellulomonas biazotea DSM 20112 | 335 | | 84 (104-198); id 35.8% | 189 (199-387); id 70.4%; complete | 62 (388-451); id 64.1%
Cellulomonas fimi DSM 20113 | 144 | | | 144 (199-342); id 74.3% |
Cellulomonas gelida DSM 20118 | 154 | | | 48 (340-387); id 68.8% | 106 (388-495); id 63.9%; complete
Cellulomonas iranensis DSM 14784 | 85 | | | 85 (250-334); id 65.9% |
Cellulomonas cellasea DSM 20118 | 301 | | 98 (99-198); id 31.0% | 189 (199-387); id 68.3%; complete | 13 (388-400); id 30.8%
Cellulomonas xylanilytica LMG 21723 | 143 | | | 143 (214-356); id 73.4% |
Oerskovia turbata DSM 20577 | 428 | 29 (2-30); id 43.3% | 171 (31-198); id 44.4% | 188 (201-389); id 73.0%; complete | 40 (390-429); id 10.0%
Oerskovia jenensis DSM 46000 | 174 | | | 174 (214-334); id 73.6% |
Cellulosimicrobium cellulans DSM 20424 | 328 | | 117 (82-198); id 6% | 199 (199-387); id 47.7%; complete | 12 (388-399)
Promicromonospora citrea DSM 43110 | 85 | | | 85 (250-334); id 75.3% |
Promicromonospora sukumoe DSM 44121 | 85 | | | 85 (250-334); id 64.7% |
Xylanibacterium ulmi LMG 21721 | 141 | | | 141 (214-354); id 72.3% |

These results clearly show that bacterial strains of the suborder Micrococcineae,
including the families Cellulomonadaceae and Promicromonosporaceae, possess genes that
are homologous with the 69B4 protease. Over the region of the mature 69B4 protease, the
gene sequence identities range from about 60% to 80%. The amino acid sequences of these
homologous sequences exhibit about 45% to 80% identity with the mature 69B4 protease
protein. In contrast to the majority of streptogrisin proteases derived from members of the
suborder Streptomycineae, these 69B4 (Asp) protease homologues from the suborder
Micrococcineae possess six cysteine residues, which form three disulfide bridges in the
mature 69B4 protease protein.
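
The "id" percentages of the kind quoted above can be illustrated with a short Python sketch. It is a minimal sketch only, assuming two fragments that have already been aligned to equal length; it is not the alignment procedure used in these experiments. The function name `percent_identity` is hypothetical, and the two ten-residue strings are illustrative toy input rather than data taken from the tables.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy fragments: 8 of 10 positions match.
    print(percent_identity("FDVIGGNAYT", "VDVIGGNAYY"))
```

Other conventions for the denominator exist (for example, the length of the shorter sequence or the full alignment length including gaps), so a value computed this way is comparable only to values computed with the same convention.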

Indeed, in spite of the incomplete sequences provided herein and questions
regarding fidelity, the present invention provides essential elements of the Asp group of
proteases and comparisons with streptogrisins. Asp is uniquely characterized, along
with Streptogrisin C, as having 3 disulfide bridges. In the following sequence, the Asp
amino acids are printed in bold and the fully conserved residues are underlined. The active
site residues are marked with # and double underlined. The cysteine residues are marked
with * and underlined. The disulfide bonds are located between C17 and C38, C95 and
C105, and C131 and C158.

1 5 8 17 20 25 30 32 

X D V [I, V] GjG [N, D] [X 9 ] CIS [I, V] G [F, Y] A V X GGF [I, V] TAGH! 

33 35 40 45 50 55 60 

C*G [Xs] G [X 2 ] T/V [X 4 ] GTF XGSS FPG N tfYA [F, W] V [X4] 

65 72 75 80 

[G, D] [Xd [L, P] [X 3 ] VN [N, R] [Y, H] [S, D] G [G, S] [R, T] V X V [A, T] G 

85 90 95 100 105 

[H, S] [T, Q] X A X VG [S, A] X VC*RSG [S, A] TT [G, A] W [H, R] C* G 

112 115 120 125 

[T, Y] [I, V] [X 3 ] [N, G] X [S, J] V X Y [P, A] [E, Q] G [T, S, D] V [R, S] GL 

130 131 135 137 140 

[I, V] R [T, G] [T, N, S] [V, A] C*AE [P, G] G P S # G G S [L, V] [L, V, I] [A. S] 

145 150 155 158 

G [N, T] OA [Q, R] G [V, L] TS G [G, R] [S, I] [G, N] [N, D] QZ [XJ G 

162 167 169 189 

G [X4] Q P [X21] (SEQ ID NO:125)

Table 7-3 (below) indicates the positions where ASP and Streptogrisin C differ:

Table 7-3. Positions At Which ASP and Streptogrisin C Differ

ASP Position | ASP Amino Acid | ASP Homologs | Streptogrisin C Amino Acid
22 | A | R? | S
25 | G | G | N
28 | I | V | A
51 | S | N? | T
55 | N | H? | R
57 | Y | Y | I
65 | G | D | N
74 | N | R | G
76 | S | D | G
77 | G | G | R
79 | R | T | D
88 | A | A | S
122 | V | V | I
125 | L | L | V
126 | I | V | T
141 | L | V | Y
145 | N | T | S


EXAMPLE 8 

Mass Spectrometric Sequencing of ASP Homologues 



In this Example, experiments conducted to confirm the DNA-derived sequence as 
well as verify/establish the N-terminal and C-terminal sequences of the mature ASP 
homologues are described. The microorganisms utilized in these experiments were the 
following: 

1. Cellulomonas biazotea DSM 20112

2. Cellulomonas flavigena DSM 20109

3. Cellulomonas fimi DSM 20113

4. Cellulomonas cellasea DSM 20118

7. Oerskovia jenensis DSM 46000 

8. Oerskovia turbata DSM 20577 

9. Cellulosimicrobium cellulans DSM 20424 

The micropurified ASP homologues were subjected to mass spectrometry-based
protein sequencing procedures, which consisted of these major steps: micropurification, gel
electrophoresis, in-gel proteolytic digestion, capillary liquid chromatography electrospray
tandem mass spectrometry (nanoLC-ESI-MS/MS), database searching of the mass
spectrometric data, and de novo sequencing. Details of these steps are described in what
follows. As described previously in Example 6, concentrated culture sample (about 200 ml)
was added to 500 ml 1 M CaCl2 and centrifuged at 14,000 rpm (model 5415C Eppendorf) for
5 min. The supernatant was cooled on ice and acidified with 200 ml 1 N HCl. After 5 min,
200 ml 50% trichloroacetic acid were added and the sample was centrifuged for 4 min at
14,000 rpm (model 5415C Eppendorf). The supernatant was discarded and the pellet was
washed first with water and then with 90% acetone. The pellet, after being dried in the
speed vac, was dissolved in 2X Protein Preparation (Tris-Glycine Sample Buffer; Novex)
buffer and diluted 1 + 1 with water before being applied to the SDS-PAGE gel. SDS-PAGE
was run with NuPAGE MES SDS Running Buffer. The SDS-PAGE gel (1 mm NuPAGE 10%
Bis-Tris; Novex) was developed and stained using standard protocols known in the art.
Following SDS-PAGE, bands corresponding to ASP homologues were excised and
processed for mass spectrometric peptide sequencing using standard protocols in the art.

Peptide mapping and sequencing were performed using capillary liquid
chromatography electrospray tandem mass spectrometry (nanoLC-ESI-MS/MS). This
analysis system consisted of a capillary HPLC system (model CapLC; Waters) and a mass
spectrometer (model Qtof Ultima API; Waters). Peptides were loaded on a pre-column
(PepMap100 C18, 5 µm, 100 Å, 300 µm ID x 1 mm; Dionex) and chromatographed on capillary
columns (Biobasic C18, 75 µm x 10 cm; New Objective) using a gradient from 0 to 100%
solvent B in 45 min at a flow rate of 200 nL/min (generated using a static split from a pump
flow rate of 5 µL/min). Solvent A consisted of 0.1% formic acid in water, and solvent B was
0.1% formic acid in acetonitrile. The mass spectrometer was operated with the following
parameters: spray voltage of 3.1 kV, desolvation zone at 150°C, mass spectra acquired
from 400 to 1900 m/z, resolution of 6000 in v-mode. Tandem MS spectra were acquired in
data-dependent mode with the two most intense peaks selected and fragmented with
mass-dependent collision energy (as specified by vendor) and collision gas (argon) at
2.5 x 10^-5 torr.

The identities of the peptides were determined using a database search program
(Mascot, Matrix Science) using a database containing ASP homologue DNA-obtained
sequences. Database searches were performed with the following parameters: no enzyme
selected, peptide error of 2.5 Da, MS/MS ions error of 0.1 Da, and variable modification of
carboxyaminomethyl cysteine. For unmatched MS/MS spectra, manual de novo sequence
assignments were performed. For example, Figure 4 shows the sequence of the
N-terminal-most tryptic peptide from C. flavigena determined from this tandem mass
spectrum. In Table 8-1, the percentage of the sequence verified on the protein level for
various homologues is reported, along with N-terminal and C-terminal peptide sequences.



Table 8-1. Mass Spec. Sequencing of ASP Homologues

ASP Homologue | Sequence Verified, % (Trypsin, Chymotrypsin Digests) | N-terminal and C-terminal Sequences (Peptide Mass in Da)
Cellulomonas cellasea | 81, 81 | [IY]AWDAFAENWDWSSR (SEQ ID NO:126) (2026.7); YGGTTYFQPVNEILQAY (SEQ ID NO:127) (1961.8)
Cellulomonas flavigena | 70, 50 | VDVI/LGGNAYYI/L[...]R (SEQ ID NO:128) (1697.7)
Cellulomonas fimi | 21, ND | VDVI/LGGDAY[...]R (SEQ ID NO:129) (1697.6)

Notes:

ND: not determined
sequence not determined indicated by [...]
sequence order not determined indicated by [ ]
isobaric residues not distinguished indicated by I/L



EXAMPLE 9 
Protease Production in Streptomyces lividans 

This Example describes experiments conducted to develop methods for production
of protease by S. lividans. Thus, a plasmid comprising a polynucleotide encoding a
polypeptide having proteolytic activity was constructed, and this vector was used to transform
Streptomyces lividans host cells. The methods used for this transformation are more fully
described in US Patent No. 6,287,839 and WO 02/50245, both of which are herein
expressly incorporated by reference.

One plasmid developed during these experiments was designated as "pSEG69B4T."
The construction of this plasmid made use of one pSEGCT plasmid vector (See, WO
02/50245). A glucose isomerase ("GI") promoter operably linked to the structural gene
encoding the 69B4 protease was used to drive the expression of the protease. A fusion
between the GI-promoter and the 69B4 signal-sequence, N-terminal prosequence and
mature sequence was constructed by fusion-PCR techniques as a XbaI-BamHI fragment.
The fragment was ligated into plasmid pSEGCT digested with XbaI and BamHI, resulting in
plasmid pSEG69B4T (See, Figure 6). Although the present Specification provides specific
expression vectors, it is contemplated that additional vectors utilizing different promoters
and/or signal sequences combined with various prosequences of the 69B4 protease will find
use in the present invention.

An additional plasmid developed during the experiments was designated as
"pSEA469B4CT" (See, Figure 7). As with the pSEG69B4T plasmid, one pSEGCT plasmid
vector was used to construct this plasmid. To create pSEA469B4CT, the Aspergillus
niger (regulatory sequence) ("A4") promoter was operably linked to the structural gene
encoding the 69B4 protease, and used to drive the expression of the protease. A fusion
between the A4-promoter and the Cel A (from Streptomyces coelicolor) signal-sequence,
the asp N-terminal prosequence and the asp mature sequence was constructed by
fusion-PCR techniques, as a XbaI-BamHI fragment. The fragment was ligated into plasmid
pSEMGCT digested with XbaI and BamHI, resulting in plasmid pSEA469B4CT (See,
Figure 7). The sequence of the A4 (A. niger) promoter region is:

1 TCGAA CTTCAT GTTCGA GTTCTT GTTCAC GTAGAA GCCGGA GATGTG AGAGGT 

AGCTT GAAGTA CAAGCT CAAGAA CAAGTG CATCTT CGGCCT CTACAC TCTCCA 
61 GATCTG GAACTG CTCACC CTCGTT GGTGGT GACCTG GAGGTA AAGCAA GTGACC CTTCTG 
CTAGAC CTTGAC GAGTGG GAGCAA CCACCA CTGGAC CTCCAT TTCGTT CACTGG GAAGAC 
121 GCGGAG GTGGTA AGGAAC GGGGTT CCACGG GGAGAG AGAGAT GGCCTT GACGGT CTTGGG 
CGCCTC CACCAT TCCTTG CCCCAA GGTGCC CCTCTC TCTCTA CCGGAA CTGCCA GAACCC 
181 AAGGGG AGCTTC NGCGCG GGGGAG GATGGT CTTGAG AGAGGG GGAGCT AGTAAT GTCGTA 
TTCCCC TCGAAG NCGCGC CCCCTC CTACCA GAACTC TCTCCC CCTCGA TCATTA CAGCAT 
241 CTTGGA CAGGGA GTGCTC CTTCTC CGACGC ATCAGC CACCTC AGCGGA GATGGC ATCGTG 
GAACCT GTCCCT CACGAG GAAGAG GCTGCG TAGTCG GTGGAG TCGCCT CTACCG TAGCAC 
301 CAGAGA CAGACC 

GTCTCT GTCTGG (SEQ ID NO:130)

In these experiments, the host Streptomyces lividans TK23 was transformed with
either of the vectors described above using protoplast methods known in the art (See e.g.,
Hopwood et al., Genetic Manipulation of Streptomyces, A Laboratory Manual, The John
Innes Foundation, Norwich, United Kingdom [1985]).

The transformed culture was expanded to provide two fermentation cultures. At
various time points, samples of the fermentation broths were removed for analysis. For the
purposes of this experiment, a skimmed milk procedure was used to confirm successful
cloning. In these methods, 30 µl of the shake flask supernatant was spotted in punched-out
holes in skim milk agar plates and incubated at 37°C. The incubated plates were visually
reviewed after overnight incubation for the presence of halos. For purposes of this
experiment, the same samples were also assayed for protease activity and for molecular
weight (SDS-PAGE). At the end of the fermentation run, full length protease was observed
by SDS-PAGE.

A sample of the fermentation broth was assayed as follows: 10 µl of the diluted
supernatant was taken and added to 190 µl AAPF substrate solution (conc. 1 mg/ml, in 0.1
M Tris/0.005% TWEEN, pH 8.6). The rate of increase in absorbance at 410 nm due to
release of p-nitroaniline was monitored (25°C). The assay results of the fermentation broth
of 3 clones (X, Y, W) obtained using the pSEG69B4T and two clones using the
pSEA469B4T indicated that Asp was expressed by both constructs (Table XXI, results for
two clones, pSEA469B4T). Indeed, the results obtained in these experiments showed that
the polynucleotide encoding a polypeptide having proteolytic activity was expressed in
Streptomyces lividans, using both of these expression vectors. Although two vectors are
described in this Example, it is contemplated that additional expression vectors using
different promoters and/or signal sequences combined with different combinations of the
69B4 protease (with or without the N-terminal and C-terminal prosequences) in the pSEA4CT
backbone (vector), as well as other constructs, will find use in the present invention.
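
The AAPF assay described above reports a rate of increase in absorbance at 410 nm. As an illustration of how such a rate might be extracted from a series of plate-reader or spectrophotometer readings, the following Python sketch fits a least-squares line to absorbance versus time. It is a minimal sketch under stated assumptions: the function names `absorbance_rate` and `relative_activity`, the blank correction, and the example readings are all hypothetical and are not taken from this Example.

```python
def absorbance_rate(times_s, a410):
    """Least-squares slope of A410 versus time, in absorbance units per minute."""
    n = len(times_s)
    if n != len(a410) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(a410) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be equal")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, a410))
    return 60.0 * sxy / sxx  # slope per second converted to per minute


def relative_activity(sample_rate, blank_rate, dilution_factor):
    """Blank-corrected rate scaled back to the undiluted supernatant (assumed scheme)."""
    return (sample_rate - blank_rate) * dilution_factor


if __name__ == "__main__":
    # Hypothetical readings taken every 30 s.
    times = [0, 30, 60, 90, 120]
    readings = [0.100 + 0.002 * t for t in times]
    print(absorbance_rate(times, readings))
```

Fitting all points in the linear range, rather than differencing two readings, makes the estimate less sensitive to noise in any single measurement.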

EXAMPLE 10 
Protease Production in B. subtilis 

In this Example, experiments conducted to produce protease 69B4 (also referred to
herein as "ASP," "Asp," "ASP protease," and "Asp protease") in B. subtilis are
described. In this Example, the transformation of plasmid pHPLT-ASP-C1-2 (See, Table
10-1; and Figure 9) into B. subtilis is described. Transformation was performed as known
in the art (See e.g., WO 02/14490, incorporated herein by reference). To optimize ASP
expression in B. subtilis, a synthetic DNA sequence was produced by DNA2.0 and utilized in
these expression experiments. The DNA sequence (synthetic ASP DNA sequence)
provided below, with codon usage adapted for Bacillus species, encodes the wild type ASP
precursor protein:

ATGACACCACGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTACACTCTTGGCTGGGGGTAT 

GGCAGCACAAGC TAACGAACCGGCTCCTCCAGGATCTGCATCAGCCCCTCCACGATTAGCTGAAAAACTTGA 

CCCTGACTTACTTGAAGCAATGGAACGCGATCTGGGGTTAGATGCAGAGGAAGCAGCTGCAACGTTAGCTTT 

TCAGCATGACGCAGCTGAAACGGGAGAGGCTCTTGCTGAGGAACTCGACGAAGATTTCGCGGGCACGTGGG 

TTGAAGATGATGTGCTGTATGTTGCAACCACTGATGAAGATGCTGTTGAAGAAGTCGAAGGCGAAGGAGCAA 

CTGCTGTGACTGTTGAGCATTCTCTTGCTGATTTAGAGGCGTGGAAGACGGTTTTGGATGCTGCGCTGGAGG 

GTCATGATGATGTGCCTACGTGGTACGTCGACGTGCCTACGAATTCGGTAGTCGTTGCTGTAAAGGCAGGAG 

CGCAGGATGTAGCTGCAGGACTTGTGGAAGGCGCTGATGTGCCATCAGATGCGGTCACTTTTGTAGAAACG 

GACGAAACGCCTAGAACGATG TTCGACGTAATTGGAGGCAACGCATATACTATTGGCGGCCGGTCTAGATG 

TTCTATCGGATTCGCAGTAAACGGTGGCTTCATTACTGCCGGTCACTGCGGAAGAACAGGAGCCACTACTG 

CCAATCCGACTGGCACATTTGCAGGTAGCTCGTTTCCGGGAAATGATTATGCATTCGTCCGAACAGGGGCA 

GGAGTAAATTTGCTTGCCCAAGTCAATAACTACTCGGGCGGCAGAGTCCAAGTAGCAGGACATACGGCCG 

CACCAGTTGGATCTGCTGTATGCCGCTCAGGTAGCACTACAGGTTGGCATTGCGGAACTATCACGGCGCT 

GAATTCGTCTGTCACGTATCCAGAGGGAACAGTCCGAGGACTTATCCGCACGACGGTTTGTGCCGAACCA 

GGTGATAGCGGAGGTAGCCTTTTAGCGGGAAATCAAGCCCAAGGTGTCACGTCAGGTGGTTCTGGAAATT 

GTCGGACGGGGGGAACAACATTCTTTCAACCAGTCAACCCGATTTTGCAGGCTTACGGCCTGAGAATGATT 

ACGACTGACTCTGGAAGTTCCCC TGCTCCAGCACCTACATCATGTACAGGCTACGCAAGAACGTTCACAGG 

AACCCTCGCAGCAGGAAGAGCAGCAGCTCAACCGAACGGTAGCTATGTTCAGGTCAACCGGAGCGGTACAC 

ATTCCGTCTGTCTCAATGGACCTAGCGGTGCGGACTTTGATTTGTATGTGCAGCGATGGAATGGCAGTAGCT 

GGGTAACCGTCGCTCAATCGACATCGCCGGGAAGCAATGAAACCATTACGTACCGCGGAAATGCTGGATATT 

ATCGCTACGTGGTTAACGCTGCGTCAGGATCAGGAGCTTACACAATGGGACTCACCCTCCCCTGA (SEQ ID 

NO:131) 

In the above sequence, bold indicates the DNA that encodes the mature protease, 
standard font indicates the leader sequence, and the underline indicates the N-terminal and 
C-terminal prosequences. 
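
To illustrate how the synthetic DNA sequence relates to the encoded protein, the following Python sketch translates the first 84 nucleotides of SEQ ID NO:131 with the standard genetic code. It is a minimal sketch; the names `translate`, `CODON_TABLE`, and `SIGNAL_PEPTIDE_DNA` are hypothetical, and treating these 84 nucleotides as the signal-peptide coding region is an assumption based on the 28-residue signal peptide listed for ASP in Table 7-2.

```python
# Standard genetic code, laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (whitespace ignored) in reading frame 1."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 84 nt of SEQ ID NO:131 (assumed here to encode the signal peptide).
SIGNAL_PEPTIDE_DNA = (
    "ATGACACCACGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTACACTCTTGGCTGGGGGTAT"
    "GGCAGCACAAGCT"
)

if __name__ == "__main__":
    print(translate(SIGNAL_PEPTIDE_DNA))
```

The last 25 residues of the translated peptide agree with the Asp-derived portion of the hybrid signal peptide given as SEQ ID NO:135.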

Expression of the Synthetic ASP Gene 

Asp expression cassettes were constructed in the pXX-KpnI (See, Figure 15) or
p2JM103-DNNDPI (See, Figure 16) vectors and subsequently cloned into the pHPLT vector
(See, Figure 17) for expression of ASP in B. subtilis. pXX-KpnI is a pUC-based vector with
the aprE promoter (B. subtilis) driving expression, a cat gene, and a duplicate aprE promoter
for amplification of the copy number in B. subtilis. The bla gene allows selective growth in E.
coli. The KpnI site, introduced in the ribosomal binding site downstream of the aprE promoter
region, together with the HindIII site enables cloning of Asp expression cassettes in
pXX-KpnI. The vector p2JM103-DNNDPI contains the aprE promoter (B. subtilis) to drive
expression of the BCE103 cellulase core (endo-cellulase from an obligatory alkaliphilic
Bacillus; See, Shaw et al., J. Mol. Biol., 320:303-309 [2002]), in frame with an acid labile
linker (DNNDPI [SEQ ID NO:132]; See, Segalas et al., FEBS Lett., 371:171-175 [1995]).
The ASP expression cassette (BamHI and HindIII) was fused to the BCE103-DNNDPI fusion
protein. When secreted, ASP is cleaved from the cellulase core to yield the mature
protease.

pHPLT (See, Figure 17; and Solingen et al., Extremophiles 5:333-341 [2001])
contains the thermostable amylase LAT promoter (Plat) of Bacillus licheniformis, followed by
XbaI and HpaI restriction sites for cloning ASP expression constructs. The following
sequence is that of the BCE103 cellulase core with the DNNDPI acid labile linker. In this
sequence, the bold indicates the acid-labile linker (the C-terminal DNNDPI residues), while
the standard font indicates the BCE103 core.



      V  R   S  K   K  L   W  I   S  L   L  F   A  L   T  L   I  F   T  M
    1 GTGAGA AGCAAA AAATTG TGGATC AGCTTG TTGTTT GCGTTA ACGTTA ATCTTT ACGATG
      CACTCT TCGTTT TTTAAC ACCTAG TCGAAC AACAAA CGCAAT TGCAAT TAGAAA TGCTAC

      A  F   S  N   M  S   A  Q   A  D   D  Y   S  V   V  E   E  H   G  Q
   61 GCGTTC AGCAAC ATGAGC GCGCAG GCTGAT GATTAT TCAGTT GTAGAG GAACAT GGGCAA
      CGCAAG TCGTTG TACTCG CGCGTC CGACTA CTAATA AGTCAA CATCTC CTTGTA CCCGTT

      L  S   I  S   N  G   E  L   V  N   E  R   G  E   Q  V   Q  L   K  G
  121 CTAAGT ATTAGT AACGGT GAATTA GTCAAT GAACGA GGCGAA CAAGTT CAGTTA AAAGGG
      GATTCA TAATCA TTGCCA CTTAAT CAGTTA CTTGCT CCGCTT GTTCAA GTCAAT TTTCCC

      M  S   S  H   G  L   Q  W   Y  G   Q  F   V  N   Y  E   S  M   K  W
  181 ATGAGT TCCCAT GGTTTG CAATGG TACGGT CAATTT GTAAAC TATGAA AGCATG AAATGG
      TACTCA AGGGTA CCAAAC GTTACC ATGCCA GTTAAA CATTTG ATACTT TCGTAC TTTACC

      L  R   D  D   W  G   I  T   V  F   R  A   A  M   Y  T   S  S   G  G
  241 CTAAGA GATGAT TGGGGA ATAACT GTATTC CGAGCA GCAATG TATACC TCTTCA GGAGGA
      GATTCT CTACTA ACCCCT TATTGA CATAAG GCTCGT CGTTAC ATATGG AGAAGT CGTCCT

      Y  I   D  D   P  S   V  K   E  K   V  K   E  T   V  E   A  A   I  D
  301 TATATT GACGAT CCATCA GTAAAG GAAAAA GTAAAA GAGACT GTTGAG GCTGCG ATAGAC
      ATATAA CTGCTA GGTAGT CATTTC CTTTTT CATTTT CTCTGA CAACTC CGACGC TATCTG

      L  G   I  Y   V  I   I  D   W  H   I  L   S  D   N  D   P  N   I  Y
  361 CTTGGC ATATAT GTGATC ATTGAT TGGCAT ATCCTT TCAGAC AATGAC CCGAAT ATATAT
      GAACCG TATATA CACTAG TAACTA ACCGTA TAGGAA AGTCTG TTACTG GGCTTA TATATA

      K  E   E  A   K  D   F  F   D  E   M  S   E  L   Y  G   D  Y   P  N
  421 AAAGAA GAAGCG AAGGAT TTCTTT GATGAA ATGTCA GAGTTG TATGGA GACTAT CCGAAT
      TTTCTT CTTCGC TTCCTA AAGAAA CTACTT TACAGT CTCAAC ATACGT CTGATA GGCTTA

      V  I   Y  E   I  A   N  E   P  N   G  S   D  V   T  W   D  N   Q  I
  481 GTGATA TACGAA ATTGCA AATGAA CCGAAT GGTAGT GATGTT ACGTGG GACAAT CAAATA
      CACTAT ATGCTT TAACGT TTACTT GGCTTA CCATCA CTACAA TGCACC CTGTTA GTTTAT

      K  P   Y  A   E  E   V  I   P  V   I  R   D  N   D  P   N  N   I  V
  541 AAACCG TATGCA GAAGAA GTGATT CCGGTT ATTCGT GACAAT GACCCT AATAAC ATTGTT
      TTTGGC ATACGT CTTCTT CACTAA GGCCAA TAAGCA CTGTTA CTGGGA TTATTG TAACAA

      I  V   G  T   G  T   W  S   Q  D   V  H   H  A   A  D   N  Q   L  A
  601 ATTGTA GGTACA GGTACA TGGAGT CAGGAT GTCCAT CATGCA GCCGAT AATCAG CTTGCA
      TAACAT CCATGT CCATGT ACCTCA GTCCTA CAGGTA GTACGT CGGCTA TTAGTC GAACGT

      D  P   N  V   M  Y   A  F   H  F   Y  A   G  T   H  G   Q  N   L  R
  661 GATCCT AACGTC ATGTAT GCATTT CATTTT TATGCA GGAACA CATGGA CAAAAT TTACGA
      CTAGGA TTGCAG TACATA CGTAAA GTAAAA ATACGT CCTTGT GTACCT GTTTTA AATGCT

      D  Q   V  D   Y  A   L  D   Q  G   A  A   I  F   V  S   E  W   G  T
  721 GACCAA GTAGAT TATGCA TTAGAT CAAGGA GCAGCG ATATTT GTTAGT GAATGG GGGACA
      CTGGTT CATCTA ATACGT AATCTA GTTCCT CGTCGC TATAAA CAATCA CTTACC CCCTGT

      S  A   A  T   G  D   G  G   V  F   L  D   E  A   Q  V   W  I   D  F
  781 AGTGCA GCTACA GGTGAT GGTGGT GTGTTT TTAGAT GAAGCA CAAGTG TGGATT GACTTT
      TCACGT CGATGT CCACTA CCACCA CACAAA AATCTA CTTCGT GTTCAC ACCTAA CTGAAA

      M  D   E  R   N  L   S  W   A  N   W  S   L  T   H  K   D  E   S  S
  841 ATGGAT GAAAGA AATTTA AGCTGG GCCAAC TGGTCT CTAACG CATAAG GATGAG TCATCT
      TACCTA CTTTCT TTAAAT TCGACC CGGTTG ACCAGA GATTGC GTATTC CTACTC AGTAGA

      A  A   L  M   P  G   A  N   P  T   G  G   W  T   E  A   E  L   S  P
  901 GCAGCG TTAATG CCAGGT GCAAAT CCAACT GGTGGT TGGACA GAGGCT GAACTA TCTCCA
      CGTCGC AATTAC GGTCCA CGTTTA GGTTGA CCACCA ACCTGT CTCCGA CTTGAT AGAGGT

      S  G   T  F   V  R   E  K   I  R   E  S   A  S   D  N   N  D   P  I
  961 TCTGGT ACATTT GTGAGG GAAAAA ATAAGA GAATCA GCATCT GACAAC AATGAT CCCATA
      AGACCA TGTAAA CACTCC CTTTTT TATTCT CTTAGT CGTAGA CTGTTG TTACTA GGGTAT


(DNA; SEQ ID NO:133) and (Amino Acid; SEQ ID NO:134) 



The Asp expression cassettes were cloned in the pXX-KpnI vector containing DNA
encoding the wild type Asp signal peptide, or a hybrid signal peptide constructed of 5
subtilisin AprE N-terminal signal peptide amino acids fused to the 25 Asp C-terminal signal
peptide amino acids (MRSKKRTVTRALAVATAAATLLAGGMAAQA (SEQ ID NO:135)), or a
hybrid signal peptide constructed of 11 subtilisin AprE N-terminal signal peptide amino acids
fused to the 19 Asp C-terminal signal peptide amino acids
(MRSKKLWISLLLAVATAAATLLAGGMAAQA (SEQ ID NO:136)). These expression
cassettes were also constructed with the asp C-terminal prosequence encoding DNA in
frame. Another expression cassette, for cloning in the p2JM103-DNNDPI vector, encodes
the ASP N-terminal pro- and mature sequence.
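
The two hybrid signal peptides described above can be reproduced by simple string slicing, as the following Python sketch shows. It is a minimal sketch under stated assumptions: the function name `hybrid_signal` is hypothetical, `ASP_SIGNAL` is the 28-residue peptide obtained by translating the first 84 nucleotides of SEQ ID NO:131 with the standard genetic code, and `APRE_PREFIX` is simply the first 11 residues of SEQ ID NO:136, taken here to be the AprE-derived portion.

```python
# Assumed inputs (see lead-in): Asp signal peptide and the AprE-derived prefix.
ASP_SIGNAL = "MTPRTVTRALAVATAAATLLAGGMAAQA"  # 28 residues
APRE_PREFIX = "MRSKKLWISLL"                  # 11 residues


def hybrid_signal(n_apre: int, n_asp: int) -> str:
    """Fuse the first n_apre AprE residues to the last n_asp Asp signal residues."""
    if not 0 < n_apre <= len(APRE_PREFIX):
        raise ValueError("n_apre out of range")
    if not 0 < n_asp <= len(ASP_SIGNAL):
        raise ValueError("n_asp out of range")
    return APRE_PREFIX[:n_apre] + ASP_SIGNAL[-n_asp:]


if __name__ == "__main__":
    print(hybrid_signal(5, 25))   # compare with SEQ ID NO:135
    print(hybrid_signal(11, 19))  # compare with SEQ ID NO:136
```

Both fusions are 30 residues long, so the crossover point moves while the overall signal peptide length stays the same.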

The Asp expression cassettes cloned in the pXX-KpnI or p2JM103-DNNDPI vector
were transformed into E. coli (Electromax DH10B, Invitrogen, Cat. No. 12033-015). The
primers and cloning strategy used are provided in Table 10-1. Subsequently, the
expression cassettes were cloned from these vectors and introduced in the pHPLT
expression vector for transformation into a B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE,
degUHy32, ΔamyE::(xylR,pxylA-comK)) strain. The primers and cloning strategy for ASP
expression cassettes cloning in pHPLT are provided in Table 10-2. Transformation to B.
subtilis was performed as described in WO 02/14490, incorporated herein by reference.
Figures 12-21 provide plasmid maps for various plasmids described herein.



Table 10-1. ASP in pXX-KpnI and p2JM103-DNNDPI

Vector Construct | Signal Peptide | ASP C-Terminal prosequence | Primers | DNA Template | Host vector | Restriction Sites Used for Cloning
pXX-ASP-1 | ASP | In frame | pXX-ASP-III/IV-Fw CTAGCTAGGTACCATGACACCACGAACTGTCACAAGAGCT (SEQ ID NO:137); ASP-syntc-ProC-RV GTGTGCAAGCTTTCAGGGGAGGGTGAGTCCCATTGTGTAA (SEQ ID NO:138) | ASP synthetic gene G00222 | pXX-KpnI | KpnI and HindIII
pXX-ASP-2 | ASP | not incorporated | pXX-ASP-III/IV-Fw CTAGCTAGGTACCATGACACCACGAACTGTCACAAGAGCT (SEQ ID NO:139); ASP-syntc-mature-RV GTGTGCAAGCTTTCAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:140) | ASP synthetic gene G00222 | pXX-KpnI | KpnI and HindIII
pXX-ASP-3 | MRSKKRTVTRALAVATAAATLLAGGMAAQA (SEQ ID NO:135) | In frame | ASP-PreCross-I-FW TCATGCAGGGTACCATGAGAAGCAAGAAGCGAACTGTCACAAGAGCTCTGGCT (SEQ ID NO:141); ASP-syntc-ProC-RV GTGTGCAAGCTTTCAGGGGAGGGTGAGTCCCATTGTGTAA (SEQ ID NO:142) | ASP synthetic gene G00222 | pXX-KpnI | KpnI and HindIII
pXX-ASP-4 | MRSKKRTVTRALAVATAAATLLAGGMAAQA (SEQ ID NO:135) | not incorporated | ASP-PreCross-I-FW TCATGCAGGGTACCATGAGAAGCAAGAAGCGAACTGTCACAAGAGCTCTGGCT (SEQ ID NO:143); ASP-syntc-mature-RV GTGTGCAAGCTTTCAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:144) | ASP synthetic gene G00222 | pXX-KpnI | KpnI and HindIII
pXX-ASP-5 | MRSKKLWISLLLAVATAAATLLAGGMAAQA (SEQ ID NO:136) | In frame | ASP-PreCross-II-FW TCATGCAGGGTACCATGAGAAGCAAGAAGTTGTGGATCAGTTTGCTGCTGGCTGTGGCAACAGCAGCTGCTACA (SEQ ID NO:145); ASP-syntc-ProC-RV GTGTGCAAGCTTTCAGGGGAGGGTGAGTCCCATTGTGTAA (SEQ ID NO:146) | ASP synthetic gene G00222 | pXX-KpnI | KpnI and HindIII
pXX-ASP-6 | MRSKKLWISLLLAVATAAATLLAGGMAAQA (SEQ ID NO:136) | not incorporated | ASP-PreCross-II-FW TCATGCAGGGTACCATGAGAAGCAAGAAGTTGTGGATCAGTTTGCTGCTGGCTGTGGCAACAGCAGCTGCTACA (SEQ ID NO:147); ASP-syntc-mature-RV GTGTGCAAGCTTTCAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:148) | ASP synthetic gene G00222 | pXX-KpnI | KpnI and HindIII
p2JM-103 ASP | BCE103 cellulase core + acid labile linker | not incorporated | DPI-ASP-syntc-ProN-FW CCATACCGGATCCAAACGAACCGGCTCCTCCAGGATCT (SEQ ID NO:149); DPI-ASP-syntc-Mature-RV CTCGAGTTAAGCTTTTAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:150) | ASP synthetic gene G00222 | p2JM103-DNNDPI | BamHI and HindIII
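
As a quick consistency check on a cloning strategy of this kind, one can confirm that each primer carries the restriction site named for its construct. The following Python sketch does this for the first primer pair of Table 10-1. It is a minimal sketch: the function name `find_sites` and the constant names are hypothetical, and the recognition sequences listed are the standard ones for these enzymes rather than values stated in this Example.

```python
# Standard recognition sequences (5' to 3') for the enzymes named in this Example.
RECOGNITION_SITES = {
    "KpnI": "GGTACC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "NheI": "GCTAGC",
    "SmaI": "CCCGGG",
}

# Primer pair listed for pXX-ASP-1 (SEQ ID NO:137 and SEQ ID NO:138).
PXX_ASP_FW = "CTAGCTAGGTACCATGACACCACGAACTGTCACAAGAGCT"
ASP_PROC_RV = "GTGTGCAAGCTTTCAGGGGAGGGTGAGTCCCATTGTGTAA"


def find_sites(primer: str) -> dict:
    """Map each enzyme name to the 0-based start positions of its site in the primer."""
    seq = "".join(primer.split()).upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    print(find_sites(PXX_ASP_FW)["KpnI"])
    print(find_sites(ASP_PROC_RV)["HindIII"])
```

All of the listed sites are palindromic, so scanning only the strand as written is sufficient here.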









Table 10-2. ASP Expression Cassettes in pHPLT

Vector construct | Primers | DNA template | Host vector | Restriction sites used for cloning
pHPLT-ASP-III | ASP-III&IV-FW TGAGCTGCTAGCAAAAGGAGAGGGTAAAGAATGACACCACGAACTGTC (SEQ ID NO:151); pHPLT-ASPproC-RV CGTACATCCCGGGTCAGGGGAGGGTGAGTCCCATTG (SEQ ID NO:152) | pXX-ASP-1 | pHPLT (XbaI x HpaI) | NheI x SmaI
pHPLT-ASP-IV | ASP-III&IV-FW TGAGCTGCTAGCAAAAGGAGAGGGTAAAGAATGACACCACGAACTGTC (SEQ ID NO:153); pHPLT-ASPmat-RV CATGCATCCCGGGTTAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:154) | pXX-ASP-2 | pHPLT (XbaI x HpaI) | NheI x SmaI
pHPLT-ASP-C1-1 | ASP-Cross-1&2-FW TGAGCTGCTAGCAAAAGGAGAGGGTAAAGAATGAGAAGCAAGAAG (SEQ ID NO:155); pHPLT-ASPproC-RV CGTACATCCCGGGTCAGGGGAGGGTGAGTCCCATTG (SEQ ID NO:156) | pXX-ASP-3 | pHPLT (XbaI x HpaI) | NheI x SmaI
pHPLT-ASP-C1-2 | ASP-Cross-1&2-FW TGAGCTGCTAGCAAAAGGAGAGGGTAAAGAATGAGAAGCAAGAAG (SEQ ID NO:157); pHPLT-ASPmat-RV CATGCATCCCGGGTTAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:158) | pXX-ASP-4 | pHPLT (XbaI x HpaI) | NheI x SmaI
pHPLT-ASP-C2-1 | ASP-Cross-1&2-FW TGAGCTGCTAGCAAAAGGAGAGGGTAAAGAATGAGAAGCAAGAAG (SEQ ID NO:159); pHPLT-ASPproC-RV CGTACATCCCGGGTCAGGGGAGGGTGAGTCCCATTG (SEQ ID NO:160) | pXX-ASP-5 | pHPLT (XbaI x HpaI) | NheI x SmaI
pHPLT-ASP-C2-2 | ASP-Cross-1&2-FW TGAGCTGCTAGCAAAAGGAGAGGGTAAAGAATGAGAAGCAAGAAG (SEQ ID NO:161); pHPLT-ASPmat-RV CATGCATCCCGGGTTAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:162) | pXX-ASP-6 | pHPLT (XbaI x HpaI) | NheI x SmaI
pHPLT-ASP-VII | pHPLT-BCE/ASP-FW TGCAGTCTGCTAGCAAAAGGAGAGGGTAAAGAGTGAGAAG (SEQ ID NO:163); pHPLT-ASPmat-RV CATGCATCCCGGGTTAAGGGGAACTTCCAGAGTCAGTC (SEQ ID NO:164) | p2JM103-ASP | pHPLT | NheI x SmaI



Primers were obtained from MWG and Invitrogen. Invitrogen Platinum Taq DNA
Polymerase High Fidelity (Cat. No. 11304-029) was used for PCR amplification (0.2 µM
primers, 25 to 30 cycles) according to Invitrogen's protocol. Ligase reactions of ASP
expression cassettes and host vectors were completed using Invitrogen T4 DNA Ligase
(Cat. No. 15224-025), utilizing Invitrogen's protocol as recommended for general cloning of
cohesive ends.

Selective growth of B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32,
ΔamyE::(xylR,pxylA-comK)) transformants harboring the p2JM103-ASP vector or one of the
pHPLT-ASP vectors was performed in shake flasks containing 25 ml Synthetic Maxatase
Medium (SMM), with 0.97 g/l CaCl2·6H2O instead of 0.5 g/l CaCl2 (See, U.S. Pat. No.
5,324,653, herein incorporated by reference) with either 25 mg/L chloramphenicol or 20
mg/L neomycin. This growth resulted in the production of secreted ASP protease with
proteolytic activity. Gel analysis was performed using NuPage Novex 10% Bis-Tris
gels (Invitrogen, Cat. No. NP0301BOX). To prepare samples for analysis, 2 volumes of
supernatant were mixed with 1 volume 1 M HCl, 1 volume 4x LDS sample buffer (Invitrogen,
Cat. No. NP0007), and 1% PMSF (20 mg/ml) and subsequently heated for 10 minutes at
70°C. Then, 25 µL of each sample was loaded onto the gel, together with 10 µL of SeeBlue
Plus2 pre-stained protein standards (Invitrogen, Cat. No. LC5925). The results clearly
demonstrated that all asp cloning strategies described in this Example yield sufficient
amounts of active Asp produced by B. subtilis.

In addition, samples of the same fermentation broths were assayed as follows: 10 µl
of the diluted supernatant was taken and added to 190 µl AAPF substrate solution (conc. 1
mg/ml, in 0.1 M Tris/0.005% TWEEN®, pH 8.6). The rate of increase in absorbance at 410
nm due to release of p-nitroaniline was monitored (25°C), as it provides a measure of the
ASP concentration produced. These results indicated that all of the constructs resulted in
the production of measurable ASP protease.

The impact of the synthetic asp gene was investigated in Bacillus subtilis by comparing
the expression levels of the pHPLT-ASP-C1-2 construct with the synthetic and native asp
gene in a B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK))
strain. The native gene was amplified from a plasmid containing the native asp gene, using
Platinum Pfx polymerase (Invitrogen) with the following primers:

AK04-12.1: Nhel thru RBS 

TTATGCGAGGCTAGCAAAAGGAGAGGGTAAAGAGTGAGAAGCAAAAAACG (SEQ ID 
NO:165) 

AK04-11: RBS thru 5 aa aprE for ASP native C1 fusion in pHPLT
taaagagtgagaagcaaaaaacgcacagtcacgcgggccctg (SEQ ID NO:166) 

AK04-13: Hpal 3' of native ASP mature 
gtcctctgttaacttacgggctgctgcccgagtcc (SEQ ID NO:167) 

The following conditions were used for these PCRs: 94°C for 2 min.; followed by 25 cycles
of 94°C for 45 sec., 60°C for 30 sec., and 68°C for 2 min. 30 sec.; followed by 68°C for 5
min. The resulting PCR product was run on an E-gel (Invitrogen), excised, and purified with
a gel extraction kit (Qiagen). This fragment containing the native ASP was ligated with
the pHPLT vector (T4 DNA Ligase, NEB) and transformed directly into B. subtilis (ΔaprE,
ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)). Transformation to B. subtilis
was performed as described in WO 02/14490 A2, herein incorporated by reference.

The Asp protein was produced by growth in shake flasks at 37°C in medium 
containing the following ingredients: 0.03 g/L MgSO4, 0.22 g/L K2HPO4, 21.3 g/L 
Na2HPO4·7H2O, 6.1 g/L NaH2PO4·H2O, 3.6 g/L urea, 7 g/L soymeal, 70 g/L Maltrin 
M150, and 42 g/L glucose, with a final pH of 7.5. In these experiments, the production level of 
the host carrying the synthetic gene cassette was found to be 3-fold higher than the host 
carrying the native gene cassette. 

In additional experiments, expression of ASP was investigated in Bacillus subtilis 
using the sacB promoter and aprE signal peptide. The gene was amplified from plasmid 
containing the synthetic asp gene using TGO polymerase (Roche) and the primers: 

CF 520 (+) Fuse ASP (pro) to aprE ss 
GCAACATGTCTGCGCAGGCTAACGAACCGGCTCCTCCAGGA (SEQ ID NO:168) 

CF 525 (-) End of Asp gene HindIII 
GACATGACATAAGCTTAAGGGGAACTTCCAGAGTC (SEQ ID NO:169) 

The sacB promoter (Bacillus subtilis), the start of the messenger RNA (+1) from 
aprE, and the aprE signal peptide were amplified from the plasmid pJHsacBJ2 using TGO 
polymerase (Roche) and the primers: 

CF 161 (+) EcoRI at start of sacB promoter 
GAGCCGAATTCATATACCTGCCGTT (SEQ ID NO:170) 

CF 521 (-) Reverse complement of CF 520 
TCCTGGAGGAGCCGGTTCGTTAGCCTGCGCAGACATGTTGC (SEQ ID NO:171) 

The following PCR conditions were used to amplify both pieces: 
94°C for 2 min.; followed by 30 cycles of 94°C for 30 sec, 50°C for 1 min., and 66°C for 1 
min.; followed by 72°C for 7 min. The resulting PCR products were run on an E-gel 
(Invitrogen), excised, and purified with a gel extraction kit (Qiagen). 
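
Primer CF 521 is described as the reverse complement of CF 520, which is what allows the two PCR products to anneal in a later overlap extension step. A minimal sketch of that relationship, using only the two sequences listed above (the helper name is hypothetical), is:

```python
# Translation table mapping each base to its complement, preserving case
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


CF520 = "GCAACATGTCTGCGCAGGCTAACGAACCGGCTCCTCCAGGA"  # SEQ ID NO:168
CF521 = "TCCTGGAGGAGCCGGTTCGTTAGCCTGCGCAGACATGTTGC"  # SEQ ID NO:171

# True when the two primers are exact reverse complements of each other
primers_are_complementary = reverse_complement(CF520) == CF521
```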

In addition, a PCR overlap extension fusion (Ho, Gene, 15:51-59 [1989]) was used to 
fuse the above gene fragment to the sacB promoter-aprE signal peptide fragment with PFX 
polymerase (Invitrogen) using the following primers: 

CF 161 (+) EcoRI at start of sacB promoter 
GAGCCGAATTCATATACCTGCCGTT (SEQ ID NO:170) 

CF 525 (-) End of Asp gene HindIII 
GACATGACATAAGCTTAAGGGGAACTTCCAGAGTC (SEQ ID NO:169) 

The following conditions were used for these PCRs: 
94°C for 2 min.; followed by 25 cycles of 94°C for 45 sec, 60°C for 30 sec, and 68°C for 2 
min. 30 sec; followed by 68°C for 5 min. The resulting PCR fusion products were run on an 
E-gel (Invitrogen), excised, and purified with a gel extraction kit (Qiagen). The purified fusions 
were cut (EcoRI/HindIII) and ligated (T4 DNA Ligase, NEB) into an EcoRI/HindIII-digested pJH101 
(Ferrari et al., J. Bacteriol., 152:809-814 [1983]) vector containing a strong transcriptional 
terminator. The ligation mixture was transformed into competent E. coli cells (Top 10 
chemically competent cells, Invitrogen) and plasmid preps were done to retrieve the plasmid 
(Qiagen spin-prep). 
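
For planning purposes, the programmed hold time of the fusion PCR described above can be added up as in the following sketch. The function name is hypothetical, and the result excludes instrument ramp times, so the actual run would take somewhat longer.

```python
def program_hold_seconds(initial_s, cycles, cycle_steps_s, final_s):
    """Sum the programmed hold times of a three-stage PCR program, in seconds."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# 94°C 2 min; 25 cycles of (94°C 45 sec, 60°C 30 sec, 68°C 2 min 30 sec); 68°C 5 min
total_seconds = program_hold_seconds(
    initial_s=120,
    cycles=25,
    cycle_steps_s=[45, 30, 150],
    final_s=300,
)
total_minutes, remaining_seconds = divmod(total_seconds, 60)
```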

The plasmid, pJHsacB-ASP (1-96 sacB promoter; 97-395 aprE +1 through end of 
aprE ss; and 396-1472 pro+mature asp; See, sequence provided below) was transformed to 
B. subtilis. Transformation to the B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, 
ΔamyE::(xylR,pxylA-comK)) strain was performed as described in WO 02/14490 A2, herein 
incorporated by reference. The chromosomal DNA was extracted from an overnight culture 
of the strain (grown in LB media), then transformed to strain BG 3594 and named "CF 202." 
This strain produced a clear halo on the indicator plate (LA + 1.6% skim milk). 
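
The coordinate ranges given for pJHsacB-ASP can be turned into feature lengths with a short sketch such as the one below. The coordinates are those stated above (1-based, inclusive); the data structure and function names are hypothetical.

```python
# (feature name, start, end) with 1-based inclusive coordinates as stated in the text
PJHSACB_ASP_FEATURES = [
    ("sacB promoter", 1, 96),
    ("aprE +1 through end of aprE signal sequence", 97, 395),
    ("pro+mature asp", 396, 1472),
]


def feature_lengths(features):
    """Return a dict mapping each feature name to its length in base pairs."""
    return {name: end - start + 1 for name, start, end in features}


def are_contiguous(features):
    """Check that each feature starts immediately after the previous one ends."""
    return all(
        features[i + 1][1] == features[i][2] + 1 for i in range(len(features) - 1)
    )


lengths = feature_lengths(PJHSACB_ASP_FEATURES)
contiguous = are_contiguous(PJHSACB_ASP_FEATURES)
```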

pJHsacB-ASP Sequence: 



CATCACATATACCTGCCGTTCACTATTATTTAGTGAAATGAGATATTATGATATTTTCTG 

AATTGTGATTAAAAAGGCAACTTTATGCCCATGCAACAGAAACTATAAAAAATACAGAGA 

ATGAAAAGAAACAGATAGATTTTTTAGTTCTTTAGGCCCGTAGTCTGCAAATCCTTTTAT 

GATTTTCTATCAAACAAAAGAGGAAAATAGACCAGTTGCAATCCAAACGAGAGTCTAAT 

AGAATGAGGTCacaGAATAGTCTTTTAAGTAAGTCTACTCTGAATTTTTTTAAAAGGAGA 

GGGTAAAGAgtgAGAAGCAAAAAATTGTGGATCAGCTTGTTGTTTGCGTTAACGTTAATC 

TTTACGATGGCGTTCAGCAACATGTCTGCGCAGGCTaacgaaccggctcctccaggatctgcatcag 

cccctccacgattagctgaaaaacttgaccctgacttacttgaagcaatggaacgcgatctggggttagatgcagaggaagca 

gctgcaacgttagcttttcagcatgacgcagctgaaacgggagaggctcttgctgaggaactcgacgaagatttcgcgggcac 

gtgggttgaagatgatgtgctgtatgttgcaaccactgatgaagatgctgttgaagaagtcgaaggcgaaggagcaactgctgt 

gactgttgagcattctcttgctgatttagaggcgtggaagacggttttggatgctgcgctggagggtcatgatgatgtgcctacgtg 

gtacgtcgacgtgcctacgaattcggtagtcgttgctgtaaaggcaggagcgcaggatgtagctgcaggacttgtggaaggcg 

ctgatgtgccatcagatgcggtcacttttgtagaaacggacgaaacgcctagaacgatgttcgacgtaattggaggcaacgcat 

atactattggcggccggtctagatgttctatcggattcgcagtaaacggtggcttcattactgccggtcactgcggaagaacagg 

agccactactgccaatccgactggcacatttgcaggtagctcgtttccgggaaatgattatgcattcgtccgaacaggggcagg 

agtaaatttgcttgcccaagtcaataactactcgggcggcagagtccaagtagcaggacatacggccgcaccagttggatctg 

ctgtatgccgctcaggtagcactacaggttggcattgcggaactatcacggcgctgaattcgtctgtcacgtatccagagggaac 

agtccgaggacttatccgcacgacggtttgtgccgaaccaggtgatagcggaggtagccttttagcgggaaatcaagcccaag 

gtgtcacgtcaggtggttctggaaattgtcggacggggggaacaacattctttcaaccagtcaacccgattttgcaggcttacggc 

ctgagaatgattacgactgactctggaagttcccctTAAGCTTAAAAAACCGGCCTTGGCCCCGCCGGTT 

TTTTATTATTTTTCTTCCTCCGCATGTTCAATCCGCTCCATAATCGACGGATGGCTCCCT 

CTGAAAATTTTAACGAGAAACGGCGGGTTGACCCGGCTCAGTCCCGTAACGGCCAAGT 

CCTGAAACGTCTCAATCGCCGCTTCCCGGTTTCCGGTCAGCTCAATGCCGTAACGGTC 

GGCGGCGTTTTCCTGATACCGGGAGACGGCATTCGTAATCGGATCCCGGACGCATCG 

TGGCCGGCATCACCGGCGCCACAGGTGCGGTTGCTGGCGCCTATATCGCCGACATCA 

CCGATGGGGAAGATCGGGCTCGCCACTTCGGGCTCATGAGCGCTTGTTTCGGCGTGG 

GTATGGTGGCAGGCCCCGTGGCCGGGGGACTGTTGGGCGCCATCTCCTTGCATGCAC 

CATTCCTTGCGGCGGCGGTGCTCAACGGCCTCAACCTACTACTGGGCTGCTTCCTAAT 

GCAGGAGTCGCATAAGGGAGAGCGTCGACCGATGCCCTTGAGAGCCTTCAACCCAGT 

CAGCTCCTTCCGGTGGGCGCGGGGCATGACTATCGTCGCCGCACTTATGACTGTCTTC 

TTTATCATGCAACTCGTAGGACAGGTGCCGGCAGCGCTCTGGGTCATTTTCGGCGAGG 

ACCGCTTTCGCTGGAGCGCGACGATGATCGGCCTGTCGCTTGCGGTATTCGGAATCTT 

GCACGCCCTCGCTCAAGCCTTCGTCACTGGTCCCGCCACCAAACGTTTCGGCGAGAA 

GCAGGCCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCTACGTCTTGCTGGCGTT 

CGCGACGCGAGGCTGGATGGCCTTCCCCATTATGATTCTTCTCGCTTCCGGCGGCATC 

GGGATGCCCGCGTTGCAGGCCATGCTGTCCAGGCAGGTAGATGACGACCATCAGGGA 

CAGCTTCAAGGATCGCTCGCGGCTCTTACCAGCCTAACTTCGATCACTGGACCGCTGA 

TCGTCACGGCGATTTATGCCGCCTCGGCGAGCACATGGAACGGGTTGGCATGGATTG 

AGGCGCCGCCCTATACCTTATTTATGTTACAGTAATATTGACTTTTAAAAAAGGATTGAT 

TCTAATGAAGAAAGCAGACAAGTAAGCCTCCTAAATTCACTTTAGATAAAAATTTAGGAG 

GCATATCAAATGAACTTTAATAAAATTGATTTAGACAATTGGAAGAGAAAAGAGATATTT 

AATCATTATTTGAACCAACAAACGACTTTTAGTATAACCACAGAAATTGATATTAGTGTTT 
TATACCGAAACATAAAACAAGAAGGATATAAATTTTACCCTGCATTTATTTTCTTAGTGA 
CAAGGGTGATAAACTCAAATACAGCTTTTAGAACTGGTTACAATAGCGACGGAGAGTTA 
GGTTATTGGGATAAGTTAGAGCCACTTTATACAATTTTTGATGGTGTATCTAAAACATTC 
TCTGGTATTTGGACTCCTGTAAAGAATGACTTCAAAGAGTTTTATGATTTATACCTTTCT 
GATGTAGAGAAATATAATGGTTCGGGGAAATTGTTTCCCAAAACACCTATACCTGAAAA 
TGCTTTTTCTCTTTCTATTATTCCATGGACTTCATTTACTGGGTTTAACTTAAATATCAAT 
AATAATAGTAATTACCTTCTACCCATTATTACAGCAGGAAAATTCATTAATAAAGGTAATT 
CAATATATTTACCGCTATCTTTACAGGTACATCATTCTGTTTGTGATGGTTATCATGCAG 
GATTGTTTATGAACTCTATTCAGGAATTGTCAGATAGGCCTAATGACTGGCTTTTATAAT 

ATGAGATAATGCCGACTGTACTTTTTACAGTCGGTTTTCTAATGTCACTAACCTGCCCC 
GTTAGTTGAAGAAGGTTTTTATATTACAGCTCCAGATCCTGCCTCGCGCGTTTCGGTGA 
TGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAA 
GCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTG 
TCGGGGCGCAGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTTAAC 

TATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGC 
ACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCACTGA 
CTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGT 
AATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGC 
CAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCC 

GCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGA 
CAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGT 
TCCGACCCTGCCGCTTACCGGATACCTGTGCGCCTTTCTCCCTTCGGGAAGCGTGGCG 
CTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTGCAAGCT 
GGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTA 

TCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT 
AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGG 
CCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAG 
TTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAG 
CGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAG 

ATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGG 
ATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGA 
AGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTA 
ATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACT 
CCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCA 

ATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAG 
CCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTAT 
TAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTG 
TTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAG 
CTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCG 

GTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCAC 
TCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTT 
TCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGA 
GTTGCTCTTGCCCGGCGTCAACACGGGATAATACCGCGCCACATAGCAGAACTTTAAA 
AGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTG 

TTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTAC 
TTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGA 
ATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGC 
ATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAA 
CAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCA 

TTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCAA 
(SEQ ID NO:172) 

Expression of the asp gene was investigated in a nine-protease delete Bacillus 
subtilis host. The plasmid pHPLT-ASP-C1-2 (See, Table 10-2, and Figure 9) was 
transformed into B. subtilis (ΔaprE, ΔnprE, Δepr, ΔispA, Δbpr, Δvpr, ΔwprA, Δmpr-ybfJ, 
ΔnprB) and (degUHy32, oppA, ΔspoIIE3501, amyE::(xylRPxylAcomK-ermC)). Transformation 
was performed as known in the art (See e.g., WO 02/14490, incorporated herein by 
reference). The Asp protein was produced by growth in shake flasks at 37°C in MBD 
medium, a MOPS based defined medium. MBD medium was made essentially as known in 
the art (See, Neidhardt et al., J. Bacteriol., 119: 736-747 [1974]), except NH4Cl2, FeSO4, 
and CaCl2 were left out of the base medium, 3 mM K2HPO4 was used, and the base 
medium was supplemented with 60 mM urea, 75 g/L glucose, and 1% soytone. Also, the 
micronutrients were made up as a 100X stock containing in one liter, 400 mg 
FeSO4·7H2O, 100 mg MnSO4·H2O, 100 mg ZnSO4·7H2O, 50 mg CuCl2·2H2O, 100 mg 
CoCl2·6H2O, 100 mg NaMoO4·2H2O, 100 mg Na2B4O7·10H2O, 10 ml of 1 M CaCl2, and 
10 ml of 0.5 M sodium citrate. The expression levels obtained in these experiments were 
found to be fairly high. 
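
The working concentrations implied by the 100X micronutrient stock can be worked out as in the sketch below. It assumes the stock is diluted 100-fold into the final medium, which is the usual meaning of "100X" but is not spelled out above; only the solid components listed per liter of stock are included, and all names are hypothetical.

```python
# Solid components of the 100X stock, in mg per liter of stock (from the text above)
STOCK_100X_MG_PER_L = {
    "FeSO4.7H2O": 400,
    "MnSO4.H2O": 100,
    "ZnSO4.7H2O": 100,
    "CuCl2.2H2O": 50,
    "CoCl2.6H2O": 100,
    "NaMoO4.2H2O": 100,
    "Na2B4O7.10H2O": 100,
}


def working_concentrations(stock_mg_per_l, fold=100):
    """Return mg/L of each component after diluting the stock `fold` times.

    The 100-fold dilution is an assumption based on the "100X" label.
    """
    return {name: mg / fold for name, mg in stock_mg_per_l.items()}


working_mg_per_l = working_concentrations(STOCK_100X_MG_PER_L)
```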

In additional embodiments, "consensus" promoters such as those developed through 
site-saturation mutagenesis to create promoters that more perfectly conform to the 
established consensus sequences for the "-10" and "-35" regions of the vegetative "sigma A- 
type" promoters for B. subtilis (See, Voskuil et al., Mol. Microbiol., 17:271-279 [1995]) find 
use in the present invention. However, it is not intended that the present invention be limited 
to any particular consensus promoter, as it is contemplated that other promoters that 
function in Bacillus cells will find use in the present invention. 

EXAMPLE 11 
Protease Production In Bacillus clausii 
In this Example, experiments conducted to produce protease 69B4 (also referred to 
as "Asp" herein) in B. clausii are described. In order to express the Asp protein in Bacillus 
clausii, it was necessary to use a promoter that works in this alkaliphilic microorganism due 
to its unique regulation systems. The production profile of the alkaline serine protease of B. 
clausii PB92 (MAXACAL® protease) has shown that it has a strong promoter 
(referred to as "MXL-prom." herein; SEQ ID NOS:173, 174, and 175, See, Figure 18) with a 
delicate regulation. Besides the promoter region, signal sequences (leader sequences) 
are also known to be very important for secreting proteins in B. clausii. Therefore, 3 constructs 
were designed with the MAXACAL® protease promoter region and separate fusions of the 
MAXACAL® protease leader sequence and the Asp leader sequence in front of the 
N-terminal Pro and the mature Asp protein, with 3, 6 and 27 amino acids of the MAXACAL® 
protease leader fused to 25, 25 and 0 amino acids of the Asp leader, respectively. 

To make these constructs, DNA fragments had to be amplified to enable the fusion. 
Therefore, PCRs were performed on both MAXACAL® protease 
and Asp template DNA with Phusion high fidelity polymerase (Finnzymes) according to the 
manufacturer's instructions. 

PCR reactions were executed with the following primers (bold indicates the 
MAXACAL® protease part of the primer) synthesized at MWG-Biotech AG: 

1: B. clau-3F: agggaaccgaatgaagaaacgaactgtcacaagagctctg (SEQ ID NO:176) 
2: B. clau-3R: cagagctcttgtgacagttcgtttcttcattcggttccct (SEQ ID NO:177) 
3: B. clau-6F: aatgaagaaaccgttggggcgaactgtcacaagagctctg (SEQ ID NO:178) 
4: B. clau-6R: cagagctcttgtgacagttcgccccaacggtttcttcatt (SEQ ID NO:179) 
5: B. clau-27F: agttcatcgatcgcatcggctaacgaaccggctcctccagga (SEQ ID NO:180) 
6: B. clau-27R: tcctggaggagccggttcgttagccgatgcgatcgatgaact (SEQ ID NO:181) 
7: B. clau-vector 5': tcagggggatcctagattctgttaacttaacgtt (SEQ ID NO:182). This primer 
contains the HpaI site (GTTAAC) from the promoter region and a BamHI site (GGATCC) 
for cloning reasons (both underlined). 
8: pHPLT-HindIII-R: gtgctgttttatcctttaccttgtctcc (SEQ ID NO:183). The sequence of 
this primer lies just upstream of the HindIII site of pHPLT-ASP-C1-2 (See, Table 10-2). 



Table 11-1. PCR Setup to Create Fused MAXACAL® Protease-Asp Leader Fragments 

Template DNA        Primer 1    Primer 2    Fragment Name 
pHPLT-ASP-C1-2      1           8           3F 
pHPLT-ASP-C1-2      3           8           6F 
pHPLT-ASP-C1-2      5           8           27F 
pMAX4               2           7           3R 
pMAX4               4           7           6R 
pMAX4               6           7           27R 
3F + 3R             7           8           3F3R 
6F + 6R             7           8           6F6R 
27F + 27R           7           8           27F27R 

In Table 11-1, "pMAX4" refers to the template described in WO 88/06623, herein 
incorporated by reference. PCR fragments 3F3R, 6F6R, and 27F27R were digested with both 
BamHI and HindIII. The digested PCR fragments were ligated with T4 ligase (Invitrogen) 
into BamHI + HindIII-opened plasmid pHPLT-ASP-C1-2 (See, Figure 18). The ligation 
product was transformed to competent B. subtilis cells (ΔaprE, ΔnprE, oppA, ΔspoIIE, 
degUHy32, ΔamyE::(xylR,pxylA-comK); See e.g., WO 02/14490, incorporated herein by 
reference) and selected on neomycin (20 mg/l). Heart Infusion-agar plates containing 
neomycin were used to identify neomycin resistant colonies. DNA of the B. subtilis 
transformants was isolated using Qiagen's plasmid isolation kit according to the manufacturer's 
instructions, and was tested for the presence of the fused MAXACAL® protease-Asp 
fragment by its pattern after digestion with both NcoI + HpaI together in one tube. The 
restriction enzymes used in this Example (i.e., BamHI, HindIII, NcoI and HpaI) were all 
purchased from NEB, and used following the instructions of the supplier. DNA of B. subtilis 
transformants that showed 2 bands with restriction enzymes (NcoI + HpaI) was used to 
transform protease negative B. clausii strain PBT142 protoplast cells (these were derived 
from PB92). 
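
The two-band screen follows from the fact that a circular plasmid cut at two positions yields two linear fragments. The sketch below illustrates this on a hypothetical 30-bp toy sequence (not the actual plasmid) using the standard NcoI (CCATGG) and HpaI (GTTAAC) recognition sequences. Fragment lengths are measured between the starts of the recognition sites, which ignores the exact cut position within each site and is therefore only approximate.

```python
def circular_digest(seq, sites):
    """Return fragment lengths from cutting a circular sequence at each site occurrence.

    Lengths are measured between recognition-site start positions (an approximation).
    Both sites used here are palindromic, so one strand is searched.
    """
    seq = seq.upper()
    n = len(seq)
    doubled = seq + seq  # lets a site that spans the origin be found
    positions = sorted(
        {i for site in sites for i in range(n) if doubled.startswith(site.upper(), i)}
    )
    if not positions:
        return [n]  # uncut circle
    count = len(positions)
    return [
        (positions[(k + 1) % count] - start) % n or n
        for k, start in enumerate(positions)
    ]


# Hypothetical toy "plasmid" carrying one NcoI site and one HpaI site
TOY_PLASMID = "AAAA" + "CCATGG" + "TTTTTTTT" + "GTTAAC" + "AAAAAA"
toy_fragments = circular_digest(TOY_PLASMID, ["CCATGG", "GTTAAC"])
```

A plasmid lacking one of the two sites would give a single fragment equal to the full plasmid length, which is how an incorrect clone would stand out in such a screen.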

The protoplast transformation of B. clausii strain PBT142 was performed according 
to the protocol mentioned for the protoplast transformation of B. alcalophilus (renamed B. 
clausii) strain PB92 in patent WO 88/06623, herein incorporated by reference. A modification 
to this protocol was the use of an alternative recipe for the regeneration plates, in that 
instead of 1.5% agar, 8.0 g/l Gelrite gellan gum (Kelco) was used. In addition, instead of 
1000 mg/l neomycin, 20 mg/l neomycin was used as described by Van der Laan et al. (Van 
der Laan et al., Appl. Environ. Microbiol., 57:901-909 [1991]). 

DNA from all 3 constructs isolated from B. subtilis (see above) was transformed into 
B. clausii PBT142 protoplasts using the same protocol as above. Transformants in B. 
clausii PBT142 were selected by replica-plating on Heart Infusion agar plates containing 20 
mg/l neomycin. The B. clausii strains with the different constructs were produced as 
indicated in Table 11-2. 



Table 11-2. B. clausii Constructs 

Construct (length MAXACAL® protease leader)    B. clausii Strain 
3 MXL/25ASP                                    PMAX-ASP3 
6 MXL/25ASP                                    PMAX-ASP2 
27 MXL/0ASP                                    PMAX-ASP1 



These 3 strains were fermented in shake flasks containing 100 ml Synthetic 
Maxatase Medium (SMM) (See, U.S. Pat. No. 5,324,653, herein incorporated by reference). 
However, instead of 0.97 g/l CaCl2·6H2O, 0.5 g/l CaCl2 was used. Also, instead of 0.5 ml/l 
antifoam 5693, 0.25 ml/l Basildon was used. The 100 ml SMM shake flasks were inoculated 
with 0.2 ml of a pre-culture of the 3 B. clausii strains containing the leader constructs in 10 
ml TSB (Tryptone Soya Broth) with 20 mg/l neomycin. The protease production values were 
measured via the AAPF-assay (as described above) after growth in the shake flasks for 3 
days. The results indicated that these constructs were able to express protease with 
proteolytic activity. 

In an additional experiment, integration of the leader construct with the entire 
MAXACAL® protease leader length (27 amino acids) was investigated. However, it is not 
intended that the present invention be limited to any particular mechanism. 

Stable integration of heterologous DNA in the B. alcalophilus (now, B. clausii) 
chromosome is described in several publications (See e.g., WO 88/06623, and Van der 
Laan et al., supra). The procedure described in patent WO 88/06623 for integration of 1 or 
2 copies of the MAXACAL® protease gene in the chromosome of B. alcalophilus (now, B. 
clausii) was used to integrate at least 1 copy of the asp gene in the chromosome of B. 
clausii PBT142. However, a derivative of pE194-neo, pENM#3 (See, Figure 19), was used 
instead of the integration vector pE194-neo (to make pMAX4 containing the MAXACAL® 
protease gene). In the integration vector pENM#3, the Asp leader PCR product 27F27R was 
cloned in the unique blunt end site HpaI in between the 5' and the 3' flanking regions of the 
MAXACAL® protease gene. Therefore, 27F27R was made blunt-ended as follows: it was 
first digested with HpaI (5' end), purified with the Qiagen PCR purification kit, and then 
digested with HindIII (3' end). This treated PCR fragment 27F27R was purified again after 
HindIII digestion (using the same Qiagen kit), filled in with dNTPs using T4 polymerase 
(Invitrogen), and purified again with the Qiagen kit. The HpaI-opened pENM#3 and the 
blunt-ended PCR product 27F27R were ligated with T4 ligase (Invitrogen). The ligation product 
was transformed directly to B. clausii PBT142 protoplasts and selected after replica-plating 
on HI agar plates with 20 mg/l neomycin. Two transformants with the correct orientation of 
the asp gene in the integration vector were identified and taken into the integration 
procedure as described in patent WO 88/06623. Selections were done at 2 mg/l and 20 
mg/l neomycin for integration in the MAXACAL® protease locus and at an illegitimate locus, 
respectively. These results indicated that B. clausii is also suitable as an expression host. 

EXAMPLE 12 
Protease Production in B. licheniformis 

In this Example, experiments conducted to produce protease 69B4 in B. licheniformis 
are described. During these experiments, various expression constructs were created to 
produce protease 69B4 (also referred to as "ASP protease") in Bacillus 
licheniformis. Constructs were cloned into expression plasmid pHPLT (replicating in 
Bacillus) and/or into integration vector pICatH. Plasmid pHPLT (See, Figure 17; and U.S. 
Pat. No. 6,562,612 [herein incorporated by reference]) is a pUB110 derivative, has a 
neomycin resistance marker for selection, and contains the B. licheniformis α-amylase (LAT) 
promoter (Plat), a sequence encoding the LAT signal peptide (preLAT), followed by PstI and 
HpaI restriction sites for cloning and the LAT transcription terminator. The pICatH vector 
(See, Figure 20) contains a temperature sensitive origin of replication (ori pE194, for 
replication in Bacillus), ori pBR322 (for amplification in E. coli), a neomycin resistance gene 
for selection, and the native B. licheniformis chloramphenicol resistance gene (cat) with 
repeats for selection, chromosomal integration and cassette amplification. 

Construct ASPc1 was created as a PstI-HpaI fragment by fusion PCR with High 
Fidelity Platinum Taq Polymerase (Invitrogen) according to the manufacturer's instructions, 
and with the following primers: 

pHPLT-BglII_FW AGTTAAGCAATCAGATCTTCTTCAGGTTA (SEQ ID NO:184) 

fusionC1_FW CATTGAAAGGGGAGGAGAATCATGAGAAGCAAGAAGCGAACTGTCAC 
(SEQ ID NO:185) 

fusionC1_RV GTGACAGTTCGCTTCTTGCTTCTCATGATTCTCCTCCCCTTTCAATG 
(SEQ ID NO:186) 

pHPLT-HindIII_RV CTTTACCTTGTCTCCAAGCTTAAAATAAAAAAACGG (SEQ ID 
NO:187) 

These primers were obtained from MWG Biotech. PCR reactions were typically 
performed on a thermocycler for 30 cycles with High Fidelity Platinum Taq polymerase 
(Invitrogen) according to the manufacturer's instructions, with an annealing temperature of 
55°C. PCR-I was performed with the primers pHPLT-BglII_FW and fusionC1_RV on pHPLT 
as template DNA. PCR-II was performed with primers fusionC1_FW and pHPLT-HindIII_RV 
on plasmid pHPLT-ASP-C1-2. The fragments from PCR-I and PCR-II were assembled in a 
fusion PCR with the primers pHPLT-BglII_FW and pHPLT-HindIII_RV. This final PCR 
fragment was purified using the Qiagen PCR purification kit, digested with BglII and HindIII, 
and ligated with T4 DNA ligase according to the manufacturer's instructions into BglII and 
HindIII digested pHPLT. The ligation mixture was transformed into B. subtilis strain OS14 
as known in the art (See, U.S. Pat. Appl. No. US20020182734 and WO 02/14490, both of 
which are incorporated herein by reference). Correct transformants produced a halo on a 
skimmed milk plate and one of them was selected to isolate plasmid pHPLT-ASPc1. This 
plasmid was introduced into B. licheniformis host BML780 (BRA7 derivative, cat-, amyL-, 
spo-, aprL-, endoGluC-) by protoplast transformation as known in the art (See, Pragai et al., 
Microbiol., 140:305-310 [1994]). Neomycin resistant transformants formed halos on skim 
milk plates, whereas the parent strain without pHPLT-ASPc1 did not. This result shows that B. 
licheniformis is capable of expressing and secreting ASP protease when expression is 
driven by the LAT promoter and when it is fused to a hybrid signal peptide 
(MRSKKRTVTRALAVATAAATLLAGGMAAQA; SEQ ID NO:135). 

Construct ASPc3 was created as a PstI-HpaI fragment by fusion PCR (necessary to 
remove the internal PstI site in the synthetic asp gene) as described above with the 
following primers: 

ASPdelPstI_FW GCGCAGGATGTAGCAGCTGGACTTGTGG (SEQ ID NO:188) 

ASPdelPstI_RV CCACAAGTCCAGCTGCTACATCCTGCGC (SEQ ID NO:189) 

AspPstI_FW GCCTCATTCTGCAGCTTCAGCAAACGAACCGGCTCCTCCAGG 
(SEQ ID NO:190) 

AspHpaI_RV CGTCCTCTGTTAACTCAGTCGTCACTTCCAGAGTCAGTCGTAATC 
(SEQ ID NO:191) 

After purification, the PCR product was digested with PstI-HpaI and ligated into PstI 
and HpaI digested pHPLT and then transformed into B. subtilis strain OS14. Plasmid 
pHPLT-ASPc3 was isolated from a neomycin resistant transformant that formed a relatively 
large halo (compared to other transformants) on a skim milk plate. Plasmid DNA was isolated 
using the Qiagen plasmid purification kit and sequenced by BaseClear. 
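
The effect of the ASPdelPstI_FW primer on the internal PstI site can be illustrated by comparing it with the matching 28-nucleotide stretch of the synthetic asp gene as it appears in the pJHsacB-ASP sequence listed in Example 10. The sketch below uses the standard PstI recognition sequence (CTGCAG); the variable and function names are hypothetical.

```python
PSTI_SITE = "CTGCAG"

# 28-nt stretch of the synthetic asp gene as listed in the pJHsacB-ASP sequence
NATIVE_SEGMENT = "gcgcaggatgtagctgcaggacttgtgg"
# ASPdelPstI_FW primer (SEQ ID NO:188)
MUTAGENIC_PRIMER = "GCGCAGGATGTAGCAGCTGGACTTGTGG"


def mismatch_positions(a, b):
    """Return 0-based positions where two equal-length sequences differ (case-insensitive)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a.upper(), b.upper())) if x != y]


site_in_native = PSTI_SITE in NATIVE_SEGMENT.upper()
site_in_primer = PSTI_SITE in MUTAGENIC_PRIMER.upper()
changed_positions = mismatch_positions(NATIVE_SEGMENT, MUTAGENIC_PRIMER)
```

In this comparison the primer differs from the gene segment at two positions, which is enough to destroy the recognition sequence while leaving the flanking bases available for annealing.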

Sequencing confirmed that the ASPc3 construct encodes mature ASP that has two 
aspartic acid residues at the extreme C-terminal end (S188D, P189D). These mutations 
were deliberately introduced by PCR to make the C-terminus of ASP less susceptible 
to proteolytic degradation (See, WO 02/055717). It also appeared that two mutations 
were introduced into the coding region of the N-terminal pro region by the PCR methods. 
These mutations caused two amino acid changes in the N-terminal pro-region: L42I and 
Q141P. Since this particular clone with these two pro(N) mutations gives a somewhat larger 
halo than other clones without these mutations, it was contemplated that expression and/or 
secretion of ASP protease in Bacillus is positively affected by these N-terminal pro 
mutations. However, it is not intended that the present invention be limited to these specific 
mutations, as it is also contemplated that further mutations will find use in the present 
invention. 
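
The substitution labels used above follow the common convention of original residue, position, and replacement residue in one-letter amino acid code. A minimal parsing sketch (the type and function names are hypothetical) is:

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the residue being replaced
    position: int     # residue position in the relevant region
    replacement: str  # one-letter code of the new residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Parse a label such as 'S188D' into its three components."""
    match = _LABEL.match(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return Substitution(original, int(position), replacement)


parsed = [parse_substitution(s) for s in ("S188D", "P189D", "L42I", "Q141P")]
```

Note that S188D and P189D refer to positions in the mature chain, whereas L42I and Q141P refer to positions in the N-terminal pro region, so the position numbers are not on a common scale.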

Next, pHPLT-ASPc3 was transformed into BML780 as described above. In contrast 
to the parental strain without the plasmid, BML780(pHPLT-ASPc3) produced a halo on a 
skim milk plate, indicating that this ASPc3 construct also leads to ASP expression in B. 
licheniformis. To make an integrated, amplified strain containing the ASPc3 expression 
cassette, the C3 construct was amplified from pHPLT-ASPc3 with the following primers: 

EBS2XhoI_FW ATCCTACTCGAGGCTTTTCTTTTGGAAGAAAATATAGGG (SEQ ID 
NO:192) 

EBS2XhoI_RV TGGAATCTCGAGGTTTTATCCTTTACCTTGTCTCC (SEQ ID 
NO:193) 

The PCR product was digested with XhoI, ligated into XhoI-digested pICatH (See, 
Figure 20) and transformed into B. subtilis OS14 as described above. The plasmid from an 
ASP expressing clone (judged by halo formation on skim milk plates) was isolated and 
designated pICatH-ASPc3. DNA sequencing by BaseClear confirmed that no further 
mutations were introduced in the ASPc3 cassette in pICatH-ASPc3. The plasmid was then 
transformed into BML780 at the permissive temperature (37°C) and one neomycin resistant 
(neoR) and chloramphenicol resistant (capR) transformant was selected and designated 
BML780(pICatH-ASPc3). The plasmid in BML780(pICatH-ASPc3) was integrated into the 
cat region on the B. licheniformis genome by growing the strain at a non-permissive 
temperature (50°C) in medium with chloramphenicol. One capR clone was 
selected and designated BML780-pICatH-ASPc3. BML780-pICatH-ASPc3 was grown again 
at the permissive temperature for several generations without antibiotics to loop-out vector 
sequences and then one neomycin sensitive (neoS), capR clone was selected. In this 
clone, vector sequences of pICatH on the chromosome were excised (including the 
neomycin resistance gene) and only the ASPc3-cat cassette was left. Note that the cat 
gene is a native B. licheniformis gene and that the asp gene is the only heterologous piece 
of DNA introduced into the host. Next, the ASPc3-cat cassette on the chromosome was 
amplified by growing the strain in/on media with increasing concentrations of 
chloramphenicol. After various rounds of amplification, one clone (resistant against 75 
µg/ml chloramphenicol) was selected and designated "BML780-ASPc3." This clone 
produced a clear halo on a skim milk plate, whereas the parental strain BML780 did not, 
indicating that ASP protease is produced and secreted by the BML780-ASPc3 strain. 

Construct ASPc4 is similar to ASPc3, but ASP protease expressed from ASPc4 does 
not have two aspartic acid residues at the C-terminal end of the mature chain. ASPc4 was 
created by amplification of the asp gene in pHPLT-ASPc3 with the following Hypur primers 
from MWG Biotech (Germany): 

XhoPlatPRElat_FW 

acccccctcgaggcttttcttttggaagaaaatatagggaaaatggtacttgttaaaaattcggaatatttatacaatatcatatgtttc 
acattgaaaggggaggagaatcatgaaacaacaaaaacggctttac (SEQ ID NO:194) 

ASPendTERMXhoI_RV 

gtcgacctcgaggttttatcctttaccttgtctccaagcttaaaataaaaaaacggatttccttcaggaaatccgtcctctgttaactc 
aaggggaacttccagagtcagtcgtaatc (SEQ ID NO:195) 

The ASPc4 PCR product was purified and digested with XhoI, ligated into 
XhoI-digested pICatH, and transformed into B. subtilis OS14 as described above for ASPc3. 
Plasmid was isolated from a neoR, capR clone and designated pICatH-ASPc4. 
pICatH-ASPc4 was transformed into BML780, integrated in the genome, vector sequences were 
excised, and the cat-ASPc4 cassette was amplified as described above for the ASPc3 
construct. Strains with the ASPc4 cassette did not produce smaller halos on skim milk 
plates than strains with the ASPc3 cassette, suggesting that the polarity of the C-terminus of 
ASP mature is not a significant factor for ASP production, secretion and/or stability in 
Bacillus. However, it is not intended that the present invention be limited to any particular 
method. 

To explore whether the native ASP signal peptide can drive export in Bacillus, ASPc5 
was constructed. PCR was performed on the synthetic asp gene of DNA2.0 with primers 
ASPendTERMXhoI_RV (above) and XhoPlatPREasp_FW. 

XhoPlatPREasp_FW 

acccccctcgaggcttttcttttggaagaaaatatagggaaaatggtacttgttaaaaattcggaatatttatacaatatcatatgttt 
cacattgaaaggggaggagaatcatgacaccacgaactgtcacaag (SEQ ID NO:196) 

The ASPc5 PCR product was purified and digested with XhoI, ligated into 
XhoI-digested pICatH, and transformed into B. subtilis OS14 as described above for ASPc3. 
Plasmid was isolated from a neoR, capR clone and designated "pICatH-ASPc5." DNA 
sequencing confirmed that no unwanted mutations were introduced into the asp gene by the 
PCR. pICatH-ASPc5 was transformed into BML780, integrated in the genome, vector 
sequences were excised, and the cat-ASPc5 cassette was amplified as described above for 
the ASPc3 construct. It was observed that B. licheniformis strains with the ASPc5 construct 
also form halos on skim milk plates, confirming that the native signal peptide of ASP 
functions as a secretion signal in Bacillus species. 

Finally, construct ASPc6 was created. It has the B. licheniformis subtilisin (aprL) 
promoter, RBS and signal peptide sequence fused in-frame to the DNA sequence encoding 
mature ASP from the optimized DNA2.0 gene. It was created by a fusion PCR with primer 
ASPendTERMXhoI_RV and the following primers: 

AprLupXhoI_FW attagtctcgaggatcgaccggaccgcaacctcc (SEQ ID NO:197) 

AprLAsp_FW cgatggcattcagcgattccgcttctgctaacgaaccggctcctccaggatctgc (SEQ ID 
NO:198) 

AprLAsp_RV gcagatcctggaggagccggttcgttagcagaagcggaatcgctgaatgccatcg (SEQ 
ID NO:199) 

PCR-I was performed with the primers AprLupXhoI_FW and AprLAsp_RV on 
chromosomal DNA of BRA7 as template DNA. PCR-II was performed with primers 
AprLAsp_FW and ASPendTERMXhoI_RV on the synthetic asp gene of DNA2.0. The 
fragments from PCR-I and PCR-II were assembled in a fusion PCR with the primers 
ASPendTERMXhoI_RV and AprLupXhoI_FW. This final PCR fragment was purified using 
Qiagen's PCR purification kit (according to the manufacturer's instructions), digested with 
XhoI, ligated into pICatH, and transformed into B. subtilis OS14, as described above for 
ASPc3. Plasmid was isolated from a neoR, capR clone and designated "pICatH-ASPc6." 
DNA sequencing confirmed that no unwanted mutations were introduced into the asp gene 
or aprL region by the PCRs. pICatH-ASPc6 was transformed into BML780, integrated in the 
genome, vector sequences were excised, and the cat-ASPc6 cassette was amplified as 
described above for the ASPc3 construct. B. licheniformis strains with the ASPc6 construct 
also formed halos on skim milk plates, indicating that the aprL promoter in combination with 
the AprL signal peptide drives expression/secretion of ASP protease in B. licheniformis. 

EXAMPLE 13 
Protease Production in T. reesei 

In this Example, experiments conducted to produce protease 69B4 in T. reesei are 
described. In these experiments, three different fungal constructs (fungal expression 
vectors comprising cbh1 fusions) were developed. One contained the ASP 5' pro region, 
mature gene, and 3' pro region; the second contained the ASP 5' pro region and the mature 
gene; and the third contained only the ASP mature gene. 

The following primer pairs were used to amplify by PCR (in the presence of 10% DMSO) the 
different fragments from the chromosomal DNA K25.10 carrying the ASP gene, and to 
introduce SpeI-AscI sites for cloning the fragments into the vector pTREX4 (See, Figure 21) 
digested with SpeI and AscI restriction enzymes. 



1. CBHI fusion with the ASP 5'pro region, mature gene, and 3'pro region: 
AspproF forward primer (SpeI-Kexin site-ATG-pro sequence): 

5'-ACTAGTAAGCGGATGAACGAGCCCGCACCACCCGGGAGCGCGAGC (SEQ ID NO:200) 

AspproR reverse primer (AscI site; C-term pro region from the TAA stop codon to the 
end of the gene): 

5'-GGCGCGCC TTA GGGGAGGGTGAGCCCCATGGTGTAGGCACCG (SEQ ID NO:201) 

2. The ASP 5'pro region and mature gene: 
(SEQ ID NO:202) 

Reverse primer (AscI site; TAA stop to the end of the mature sequence): 
(SEQ ID NO:203) 

3. The ASP mature gene only: 
(SEQ ID NO:204) 

(SEQ ID NO:205) 
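
The annotated layout of the AspproF and AspproR primers given above (restriction site, kexin site, start or stop codon, gene-specific tail) can be checked mechanically. The sketch below is illustrative only: the helper names are hypothetical, the primer strings are copied from SEQ ID NO:200 and NO:201 with spacing removed, and the recognition sequences used for SpeI (ACTAGT) and AscI (GGCGCGCC) are the standard ones.

```python
SPEI_SITE = "ACTAGT"    # SpeI recognition sequence
ASCI_SITE = "GGCGCGCC"  # AscI recognition sequence

ASPPRO_F = "ACTAGTAAGCGGATGAACGAGCCCGCACCACCCGGGAGCGCGAGC"  # SEQ ID NO:200
ASPPRO_R = "GGCGCGCCTTAGGGGAGGGTGAGCCCCATGGTGTAGGCACCG"     # SEQ ID NO:201

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_forward(primer: str):
    """Split AspproF into (SpeI site, kexin codons, start codon, pro-sequence tail)."""
    return primer[:6], primer[6:12], primer[12:15], primer[15:]


def split_reverse(primer: str):
    """Split AspproR into (AscI site, stop codon as written on the primer, tail)."""
    return primer[:8], primer[8:11], primer[11:]


if __name__ == "__main__":
    site, kexin, start, tail = split_forward(ASPPRO_F)
    print("forward:", site, kexin, start, tail)
    site_r, stop_rc, tail_r = split_reverse(ASPPRO_R)
    # The primer carries TTA, i.e. the TAA stop codon read on the coding strand.
    print("reverse:", site_r, reverse_complement(stop_rc), tail_r)
```

The kexin site corresponds to the codons AAG CGG (Lys-Arg), placed directly in front of the ATG.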

After construction, the different plasmids were transformed into a Trichoderma reesei 
strain with disruptions in the cbh1, cbh2, egl1, and egl2 genes, using biolistic transformation 
methods known in the art. Stable transformants were screened, based on morphology. Ten 
stable transformants for each construct were screened in shake flasks. The initial inoculum 
media used contained 30 g/L α-lactose, 6.5 g/L (NH4)2SO4, 2 g/L KH2PO4, 0.3 g/L 
MgSO4·7H2O, 0.2 g/L CaCl2, 1 ml/L 1000X T. reesei Trace Salts, 2 mL/L 10% TWEEN-80, 
22.5 g/L Proflo, and 0.72 g/L CaCO3, in which the transformants were grown for 
approximately 48 hr. After this incubation period, 10% of the culture was transferred into 
flasks containing minimal medium known in the art (See, Foreman et al., J. Biol. Chem., 
278:31988-31997 [2003]), with 16 g/L of lactose to induce expression. The flasks were 
placed in a 28°C shaker. Four-day samples were run on NuPAGE 4-12% gels, and stained 
with Coomassie Blue. After five days, the protease activity was measured by adding 10 µl of 
the supernatant to 190 µl AAPF substrate solution (conc. 1 mg/ml, in 0.1 M Tris/0.005% 
TWEEN, pH 8.6). The rate of increase in absorbance at 410 nm due to release of 
p-nitroaniline was monitored (25°C). 

The activity data showed that there was a 5x higher production over the control strain 
(i.e., the parent strain), indicating that T. reesei is suitable for the expression of ASP 
protease. 
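
The AAPF assay above reports activity as the rate of increase in absorbance at 410 nm. A minimal sketch of how such a rate, and a fold-increase over the parent strain, could be computed is shown below. The function names are hypothetical, and the readings in the usage section are invented placeholders for illustration only; they are not data from this Example.

```python
def absorbance_rate(times_min, a410):
    """Least-squares slope of A410 versus time (absorbance units per minute)."""
    n = len(times_min)
    if n < 2 or n != len(a410):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(a410) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, a410))
    return sxy / sxx


def fold_over_control(sample_rate, control_rate):
    """Ratio of the sample rate to the control (parent strain) rate."""
    if control_rate == 0:
        raise ValueError("control rate is zero")
    return sample_rate / control_rate


# Dilution of supernatant in the well: 10 ul added to 190 ul substrate solution.
DILUTION_FACTOR = (10 + 190) / 10

if __name__ == "__main__":
    # Placeholder readings, NOT measured values from the Example.
    times = [0, 1, 2, 3, 4]
    control = [0.05, 0.07, 0.09, 0.11, 0.13]
    sample = [0.05, 0.11, 0.17, 0.23, 0.29]
    ratio = fold_over_control(absorbance_rate(times, sample),
                              absorbance_rate(times, control))
    print("dilution factor:", DILUTION_FACTOR)
    print("fold over control:", round(ratio, 2))
```

A linear fit is only appropriate over the initial, linear portion of the progress curve; that restriction is an assumption of this sketch rather than a statement from the text.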



EXAMPLE 14 
Protease Production in A. niger 

In this Example, experiments conducted to produce protease 69B4 in Aspergillus 
niger var. awamori (PCT WO90/00192) are described. In these experiments, four different 
fungal constructs (fungal expression vectors comprising glaA fusions) were developed. One 
contained the ASP pre-region, 5' pro-region, mature gene, and the 3' pro-region; the second 
contained the ASP pre-region, 5' pro-region, and the mature gene; the third contained the 
ASP 5' pro-region, mature gene, and the 3' pro-region; the fourth contained the ASP 5' 
pro-region and the mature gene. 

Selected from the following primer pairs, primers were used to PCR (in the presence 
of 10% DMSO) the different fragments from the chromosomal DNA 69B4 carrying the asp 
gene and to introduce the NheI-BstEII sites to clone the fragments into the vector 
pSLGAMpR2 (See, Figure 22) digested with NheI and BstEII restriction enzymes. 

Primers Anforward 01 and Anforward 02 contained attB1 Gateway cloning 
sequences (Invitrogen) at the 5' end of the primer. Primers Anreversed 01 and Anreversed 
02 contained attB2 Gateway cloning sequences (Invitrogen) at the 5' end of the primer. 
These primers were used to PCR (in the presence of 10% DMSO) the different fragments 
from the chromosomal DNA 69B4 carrying the ASP genes. 

The different constructs were transferred to an A. niger Gateway-compatible 
destination vector pRAXdes2 (See, Figure 23; See also, U.S. Pat. Appln. Ser. No. 
10/804,785, and PCT Appln. No. US04/08520, both of which are incorporated herein by 
reference). 

Anforward 01 (without the attB1 sequence) 

5'-ATGACACCACGAACTGTCACAAGAGCTCTG-3' (SEQ ID NO:206) 

Anforward 02 (without the attB1 sequence) 

5'-AACGAACCGGCTCCTCCAGGATCTGCATCA-3' (SEQ ID NO:207) 

Anreversed 01 (without the attB2 sequence) 

5'-AGGGGAACTTCCAGAGTCAGTCGTAATCATTCTCAGGCC-3' (SEQ ID NO:208) 

Anreversed 02 (without the attB2 sequence) 

5'-GGGGAGGGTGAGTCCCATTGTGTAAGCTCCTGA-3' (SEQ ID NO:209) 

pSLGAM-NT_FW 

5'-ACCGCGACTGCTAGCAACGTCATCTCCAAGCGCGGCGGTGGCAACGAACCGGCTCCT 
CCAGGATCT-3' (SEQ ID NO:210) 

pSLGAM-MAT_FW 

5'-ACCGCGACTGCTAGCAACGTCATCTCCAAGCGCGGCGGTGGCAACGAACCGGCTCCT 
CCAGGATCT-3' (SEQ ID NO:211) 

pSLGAM-MAT_RV 

5'-CCGCCAGGTGTCGGTCACCTAAGGGGAACTTCCAGAGTCAGTCGTAATCATTCT-3' 
(SEQ ID NO:212) 

PCR conditions were as follows: 5 uL of 10X PCR reaction buffer (Invitrogen); 20 
mM MgSO4; 0.2 mM each of dATP, dTTP, dGTP, dCTP (final concentration); 1 uL of 10 
ng/uL genomic DNA; 1 uL of High Fidelity Taq polymerase (Invitrogen) at 1 unit per uL; 
0.2 uM of each primer (final concentration); 5 uL DMSO; and water to 50 uL. The PCR 
protocol was: 94°C for 5 min.; followed by 30 cycles of 94°C for 30 sec., 55°C for 30 sec., 
and 68°C for 3 min.; followed by 68°C for 10 min. and 15°C for 1 min. 
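
The thermocycler protocol and reaction set-up above can be written down as a small data structure, which makes the total programmed hold time and the final concentrations easy to check. This is a sketch under stated assumptions: the class and function names are hypothetical, ramp times between steps are ignored, and DMSO is treated as a 100% stock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


# Protocol as described in the text.
INITIAL_DENATURATION = Step(94, 5 * 60)
CYCLE = (Step(94, 30), Step(55, 30), Step(68, 3 * 60))
N_CYCLES = 30
FINAL_STEPS = (Step(68, 10 * 60), Step(15, 60))

REACTION_VOLUME_UL = 50.0


def total_hold_seconds() -> int:
    """Sum of all programmed hold times (ramping between steps not included)."""
    per_cycle = sum(step.seconds for step in CYCLE)
    final = sum(step.seconds for step in FINAL_STEPS)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + final


def final_concentration(stock, stock_volume_ul, total_volume_ul=REACTION_VOLUME_UL):
    """Concentration after diluting stock_volume_ul of stock into total_volume_ul."""
    return stock * stock_volume_ul / total_volume_ul


if __name__ == "__main__":
    print("programmed time (min):", total_hold_seconds() / 60)
    print("DMSO (% v/v):", final_concentration(100.0, 5.0))
    print("template (ng/uL):", final_concentration(10.0, 1.0))
```

With these numbers, 5 uL of DMSO in a 50 uL reaction corresponds to the 10% DMSO mentioned in the text, and the programmed hold time comes to 136 minutes.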

After construction, the different plasmids and a helper plasmid (HM 396 pAPDI) 
were transformed into Aspergillus niger var. awamori (Delta Ap4 strain), using protoplast 
transformation methods known in the art. Stable transformants were screened, based on 
morphology. Ten stable transformants for each construct were screened in shake flasks. 
After this period, a piece of agar containing the strain was transferred into flasks containing 
RoboSoy medium or the formula 12 g/l Tryptone, 8 g/l Soytone, 15 g/l ammonium sulfate, 
12.1 g/l NaH2PO4·H2O, 2.19 g/l Na2HPO4, 5 ml 20% MgSO4·7H2O, 10 ml 10% Tween 80, 
500 ml 30% maltose, 50 ml 1 M phosphate buffer pH 5.8, and 2 g/l uridine to induce 
expression. The flasks were placed in a 28°C shaker. Four-day samples were run on 
NuPAGE 10% Bis-Tris protein gels, and stained with Coomassie Blue. Five-day samples 
were assayed for protease activity using the AAPF method. 

The amount of ASP expressed was found to be low, such that it could not be 
detected in the Coomassie-stained gel. Colonies on skim milk agar plates, however, showed 
clear halos that were significantly larger than those of the control strain. 
Thus, although the expression was low, these results clearly indicate that A. niger is suitable 
for the expression of ASP protease. 



EXAMPLE 15 

Generation of Asp Site-Saturated Mutagenesis (SSM) Libraries 

In this Example, experiments conducted to develop site-saturation mutagenesis 
libraries of asp are described. Site-saturated Asp libraries each contained 96 B. subtilis 
(ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, ΔamyE::(xylR,pxylA-comK)) clones harboring the 
pHPLT-ASP-C1-2 expression vector. This vector, containing the Asp expression cassette 
composed of the synthetic DNA sequence (See, Example 10) encoding the Asp hybrid 
signal peptide and the Asp N-terminal pro and mature protein, was found to enable 
expression of the protein indicated below (the signal peptide and precursor protease) and 
secretion of the mature Asp protease. 

DNA Sequence encoding synthetic Asp hybrid signal peptide: 

ATGAGAAGCAAGAAGCGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTA 
CACTCTTGGCTGGGGGTATGGCAGCACAAGCT (SEQ ID NO:213) 
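
As a consistency check on the sequence above, the DNA of SEQ ID NO:213 can be translated with the standard genetic code. The sketch below is illustrative (the function and variable names are not from the source); it builds the codon table from the conventional TCAG ordering and translates the 90-nucleotide sequence as printed.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

# SEQ ID NO:213 as printed above, with the line break removed.
SIGNAL_PEPTIDE_DNA = (
    "ATGAGAAGCAAGAAGCGAACTGTCACAAGAGCTCTGGCTGTGGCAACAGCAGCTGCTA"
    "CACTCTTGGCTGGGGGTATGGCAGCACAAGCT"
)


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate(SIGNAL_PEPTIDE_DNA))
```

The translation is the 30-residue peptide MRSKKRTVTRALAVATAAATLLAGGMAAQA, which matches the signal-peptide portion at the start of SEQ ID NO:214.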

The signal peptide and precursor protease are provided in the following sequence (SEQ ID 
NO:214) (in this sequence, bold indicates the mature protease, underlining indicates the N- 
terminal prosequence, and the standard font indicates the signal peptide): 

MRSKKRTVTRALAVATAAATLLAGGMAAQA ^IFPAPPGSASAPPRLAEKLDPDLLEAMERDL 

GJ nAFFAAATLAPOHnAAETGEAL APFI nFDFAGTWVFPPVI YVATTDFDAVEEVEGEGA 
TANm/EHSLADl^AWKTVIDAAL F nHnnVPTWYVDVPTNSvVVAVKAGAQDVAAGLVEGA 

nx/pgnAVTFVETDETPRTM FDVIGGNAYTIGGRSRCSIGFAVNGGFITAGHCGRTGATTAN 

PTGTFAGSSFPGNDYAFVRTGAGVNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGSTT 

GWHCGTITALNSSVTYPEGTVRGLIRTTVCAEPGDSGGSLLAGNQAQGVTSGGSGNCRT 

GGTTFFQPVNPILQAYGLRMITTDSGSSP (SEQ ID NO:214) 

Construction of the 189 asp site-saturated mutagenesis libraries was completed by 
using the pHPLT-ASP-C1-2 expression vector as template and the primers listed in Table 15-1. 
The mutagenesis primers used in these experiments all contain the triple DNA sequence 
code NNS (N = A, C, T or G and S = C or G) at the position that corresponds with the codon 
of the Asp mature sequence to be mutated, which guaranteed random incorporation of 
nucleotides at that position. Construction of each SSM library started with two PCR 
amplifications, one using the pHPLT-BglII-FW primer and a specific Reverse mutagenesis primer, 
and one using the pHPLT-BglII-RV primer and a specific Forward mutagenesis primer (equal 
positions for the mutagenesis primers). Platinum Taq DNA polymerase High Fidelity (Cat. No. 
11304-029; Invitrogen) was used for PCR amplification (0.2 uM primers, 20 up to 30 cycles) 
according to the protocol provided by Invitrogen. Briefly, 1 uL of amplified DNA fragment from 
each of the two specific PCR mixes, both targeting the same codon, was added to 48 uL of fresh 
PCR reaction solution together with primers pHPLT-BglII-FW and pHPLT-BglII-RV. This fusion 
PCR amplification (22 cycles) resulted in a linear pHPLT-ASP-C1-2 DNA fragment with a 
specific Asp mature codon randomly mutated and a unique BglII restriction site on both 
ends. Purification of this DNA fragment (Qiagen PCR purification kit, Cat. No. 28106), 
digestion with BglII, an additional purification step, and a ligation reaction 
(Invitrogen T4 DNA Ligase, Cat. No. 15224-025) generated circular and multimeric DNA that 
was subsequently transformed into B. subtilis (ΔaprE, ΔnprE, oppA, ΔspoIIE, degUHy32, 
ΔamyE::(xylR,pxylA-comK)). For each library, after overnight incubation at 37°C, 96 single 
colonies were picked from Heart Infusion agar plates with 20 mg/L neomycin and grown for 
4 days at 37°C in MOPS media with 20 mg/ml neomycin and 1.25 g/L yeast extract (See, 
WO 03/062380, incorporated herein by reference, for the exact medium formulation used 
herein) for sequence analysis (BaseClear) and protease expression for screening purposes. 
The library numbers ranged from 1 up to 189, with each number representing the codon of 
the mature asp sequence that is randomly mutated. After selection, each library included a 
maximum of 20 Asp protease variants. 
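
The NNS scheme and the primer layout described above (18 nucleotides of template, the degenerate codon, then 18 more nucleotides, with the reverse primer as the reverse complement) can be sketched in Python. This is an illustration and not the procedure actually used to design the primers: the function names are hypothetical, the short template is only the region around codon 1 as implied by the first primers of Table 15-1, and the sampling estimate assumes all 32 codons are equally likely.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

# IUPAC complement: N pairs with N, S (C or G) pairs with S.
_IUPAC_COMPLEMENT = str.maketrans("ACGTNS", "TGCANS")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence that may contain N and S."""
    return seq.upper().translate(_IUPAC_COMPLEMENT)[::-1]


def nns_codons():
    """All codons matched by NNS (N = A, C, G or T; S = C or G)."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "CG"]


def design_ssm_primers(template: str, codon_start: int, flank: int = 18):
    """Return (forward, reverse) primers randomizing the codon at codon_start.

    codon_start is the 0-based index of the first base of the target codon.
    """
    template = template.upper()
    if codon_start < flank or codon_start + 3 + flank > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    forward = (template[codon_start - flank:codon_start] + "NNS"
               + template[codon_start + 3:codon_start + 3 + flank])
    return forward, reverse_complement(forward)


def chance_codon_sampled(n_clones: int = 96, n_codons: int = 32) -> float:
    """Probability that one given codon occurs at least once among n_clones,
    assuming every codon is equally likely (an idealization)."""
    return 1 - (1 - 1 / n_codons) ** n_clones


if __name__ == "__main__":
    codons = nns_codons()
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    print(len(codons), "codons,", len(amino_acids), "amino acids, stops:", stops)
    region = "GAAACGCCTAGAACGATG" + "TTC" + "GACGTAATTGGAGGCAAC"
    forward, reverse = design_ssm_primers(region, 18)
    print(forward)
    print(reverse)
```

NNS gives 32 codons that encode all 20 amino acids with a single stop codon (TAG). For the example region, the two primers produced are identical to asp1F and asp1R in Table 15-1.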



Table 15-1. Primers Used to Generate Synthetic ASP SSM Libraries 

pHPLT-BglII-FW    GCAATCAGATCTTCCTTCAGGTTATGACC (SEQ ID NO:215) 
pHPLT-BglII-RV    GCATCGAAGATCTGATTGCTTAACTGCTTC (SEQ ID NO:216) 

Forward Mutagenesis Primer    DNA sequence, 5' to 3' 

asp1F    GAAACGCCTAGAACGATGNNSGACGTAATTGGAGGCAAC (SEQ ID NO:217) 
asp2F    ACGCCTAGAACGATGTTCNNSGTAATTGGAGGCAACGCA (SEQ ID NO:218) 
asp3F    CCTAGAACGATGTTCGACNNSATTGGAGGCAACGCATAT (SEQ ID NO:219) 
asp4F    AGAACGATGTTCGACGTANNSGGAGGCAACGCATATACT (SEQ ID NO:220) 
asp5F    ACGATGTTCGACGTAATTNNSGGCAACGCATATACTATT (SEQ ID NO:221) 
asp6F    ATGTTCGACGTAATTGGANNSAACGCATATACTATTGGC (SEQ ID NO:222) 
asp7F    TTCGACGTAATTGGAGGCNNSGCATATACTATTGGCGGC (SEQ ID NO:223) 
asp8F    GACGTAATTGGAGGCAACNNSTATACTATTGGCGGCCGG (SEQ ID NO:224) 
asp9F    GTAATTGGAGGCAACGCANNSACTATTGGCGGCCGGTCT (SEQ ID NO:225) 
asp10F    ATTGGAGGCAACGCATATNNSATTGGCGGCCGGTCTAGA (SEQ ID NO:226) 
asp11F    GGAGGCAACGCATATACTNNSGGCGGCCGGTCTAGATGT (SEQ ID NO:227) 
asp12F    GGCAACGCATATACTATTNNSGGCCGGTCTAGATGTTCT (SEQ ID NO:228) 
asp13F    AACGCATATACTATTGGCNNSCGGTCTAGATGTTCTATC (SEQ ID NO:229) 
asp14F    GCATATACTATTGGCGGCNNSTCTAGATGTTCTATCGGA (SEQ ID NO:230) 
asp15F    TATACTATTGGCGGCCGGNNSAGATGTTCTATCGGATTC (SEQ ID NO:231) 
asp16F    ACTATTGGCGGCCGGTCTNNSTGTTCTATCGGATTCGCA (SEQ ID NO:232) 
asp17F    ATTGGCGGCCGGTCTAGANNSTCTATCGGATTCGCAGTA (SEQ ID NO:233) 
asp18F    GGCGGCCGGTCTAGATGTNNSATCGGATTCGCAGTAAAC (SEQ ID NO:234) 
asp19F    GGCCGGTCTAGATGTTCTNNSGGATTCGCAGTAAACGGT (SEQ ID NO:235) 
asp20F    CGGTCTAGATGTTCTATCNNSTTCGCAGTAAACGGTGGC (SEQ ID NO:236) 
asp21F    TCTAGATGTTCTATCGGANNSGCAGTAAACGGTGGCTTC (SEQ ID NO:237) 
asp22F    AGATGTTCTATCGGATTCNNSGTAAACGGTGGCTTCATT (SEQ ID NO:238) 
asp23F    TGTTCTATCGGATTCGCANNSAACGGTGGCTTCATTACT (SEQ ID NO:239) 
asp24F    TCTATCGGATTCGCAGTANNSGGTGGCTTCATTACTGCC (SEQ ID NO:240) 
asp25F    ATCGGATTCGCAGTAAACNNSGGCTTCATTACTGCCGGT (SEQ ID NO:241) 
asp26F    GGATTCGCAGTAAACGGTNNSTTCATTACTGCCGGTCAC (SEQ ID NO:242) 
asp27F    TTCGCAGTAAACGGTGGCNNSATTACTGCCGGTCACTGC (SEQ ID NO:243) 
asp28F    GCAGTAAACGGTGGCTTCNNSACTGCCGGTCACTGCGGA (SEQ ID NO:244) 
asp29F    GTAAACGGTGGCTTCATTNNSGCCGGTCACTGCGGAAGA (SEQ ID NO:245) 
asp30F    AACGGTGGCTTCATTACTNNSGGTCACTGCGGAAGAACA (SEQ ID NO:246) 
asp31F    GGTGGCTTCATTACTGCCNNSCACTGCGGAAGAACAGGA (SEQ ID NO:247) 
asp32F    GGCTTCATTACTGCCGGTNNSTGCGGAAGAACAGGAGCC (SEQ ID NO:248) 




asp33F    TTCATTACTGCCGGTCACNNSGGAAGAACAGGAGCCACT (SEQ ID NO:249) 
asp34F    ATTACTGCCGGTCACTGCNNSAGAACAGGAGCCACTACT (SEQ ID NO:250) 
asp35F    ACTGCCGGTCACTGCGGANNSACAGGAGCCACTACTGCC (SEQ ID NO:251) 
asp36F    GCCGGTCACTGCGGAAGANNSGGAGCCACTACTGCCAAT (SEQ ID NO:252) 
asp37F    GGTCACTGCGGAAGAACANNSGCCACTACTGCCAATCCG (SEQ ID NO:253) 
asp38F    CACTGCGGAAGAACAGGANNSACTACTGCCAATCCGACT (SEQ ID NO:254) 
asp39F    TGCGGAAGAACAGGAGCCNNSACTGCCAATCCGACTGGC (SEQ ID NO:255) 
asp40F    GGAAGAACAGGAGCCACTNNSGCCAATCCGACTGGCACA (SEQ ID NO:256) 
asp41F    AGAACAGGAGCCACTACTNNSAATCCGACTGGCACATTT (SEQ ID NO:257) 
asp42F    ACAGGAGCCACTACTGCCNNSCCGACTGGCACATTTGCA (SEQ ID NO:258) 
asp43F    GGAGCCACTACTGCCAATNNSACTGGCACATTTGCAGGT (SEQ ID NO:259) 
asp44F    GCCACTACTGCCAATCCGNNSGGCACATTTGCAGGTAGC (SEQ ID NO:260) 
asp45F    ACTACTGCCAATCCGACTNNSACATTTGCAGGTAGCTCG (SEQ ID NO:261) 
asp46F    ACTGCCAATCCGACTGGCNNSTTTGCAGGTAGCTCGTTT (SEQ ID NO:262) 
asp47F    GCCAATCCGACTGGCACANNSGCAGGTAGCTCGTTTCCG (SEQ ID NO:263) 
asp48F    AATCCGACTGGCACATTTNNSGGTAGCTCGTTTCCGGGA (SEQ ID NO:264) 
asp49F    CCGACTGGCACATTTGCANNSAGCTCGTTTCCGGGAAAT (SEQ ID NO:265) 
asp50F    ACTGGCACATTTGCAGGTNNSTCGTTTCCGGGAAATGAT (SEQ ID NO:266) 
asp51F    GGCACATTTGCAGGTAGCNNSTTTCCGGGAAATGATTAT (SEQ ID NO:267) 
asp52F    ACATTTGCAGGTAGCTCGNNSCCGGGAAATGATTATGCA (SEQ ID NO:268) 
asp53F    TTTGCAGGTAGCTCGTTTNNSGGAAATGATTATGCATTC (SEQ ID NO:269) 
asp54F    GCAGGTAGCTCGTTTCCGNNSAATGATTATGCATTCGTC (SEQ ID NO:270) 
asp55F    GGTAGCTCGTTTCCGGGANNSGATTATGCATTCGTCCGA (SEQ ID NO:271) 
asp56F    AGCTCGTTTCCGGGAAATNNSTATGCATTCGTCCGAACA (SEQ ID NO:272) 
asp57F    TCGTTTCCGGGAAATGATNNSGCATTCGTCCGAACAGGG (SEQ ID NO:273) 
asp58F    TTTCCGGGAAATGATTATNNSTTCGTCCGAACAGGGGCA (SEQ ID NO:274) 
asp59F    CCGGGAAATGATTATGCANNSGTCCGAACAGGGGCAGGA (SEQ ID NO:275) 
asp60F    GGAAATGATTATGCATTCNNSCGAACAGGGGCAGGAGTA (SEQ ID NO:276) 
asp61F    AATGATTATGCATTCGTCNNSACAGGGGCAGGAGTAAAT (SEQ ID NO:277) 
asp62F    GATTATGCATTCGTCCGANNSGGGGCAGGAGTAAATTTG (SEQ ID NO:278) 
asp63F    TATGCATTCGTCCGAACANNSGCAGGAGTAAATTTGCTT (SEQ ID NO:279) 
asp64F    GCATTCGTCCGAACAGGGNNSGGAGTAAATTTGCTTGCC (SEQ ID NO:280) 
asp65F    TTCGTCCGAACAGGGGCANNSGTAAATTTGCTTGCCCAA (SEQ ID NO:281) 
asp66F    GTCCGAACAGGGGCAGGANNSAATTTGCTTGCCCAAGTC (SEQ ID NO:282) 
asp67F    CGAACAGGGGCAGGAGTANNSTTGCTTGCCCAAGTCAAT (SEQ ID NO:283) 
asp68F    ACAGGGGCAGGAGTAAATNNSCTTGCCCAAGTCAATAAC (SEQ ID NO:284) 
asp69F    GGGGCAGGAGTAAATTTGNNSGCCCAAGTCAATAACTAC (SEQ ID NO:285) 
asp70F    GCAGGAGTAAATTTGCTTNNSCAAGTCAATAACTACTCG (SEQ ID NO:286) 
asp71F    GGAGTAAATTTGCTTGCCNNSGTCAATAACTACTCGGGC (SEQ ID NO:287) 
asp72F    GTAAATTTGCTTGCCCAANNSAATAACTACTCGGGCGGC (SEQ ID NO:288) 
asp73F    AATTTGCTTGCCCAAGTCNNSAACTACTCGGGCGGCAGA (SEQ ID NO:289) 
asp74F    TTGCTTGCCCAAGTCAATNNSTACTCGGGCGGCAGAGTC (SEQ ID NO:290) 
asp75F    CTTGCCCAAGTCAATAACNNSTCGGGCGGCAGAGTCCAA (SEQ ID NO:291) 
asp76F    GCCCAAGTCAATAACTACNNSGGCGGCAGAGTCCAAGTA (SEQ ID NO:292) 
asp77F    CAAGTCAATAACTACTCGNNSGGCAGAGTCCAAGTAGCA (SEQ ID NO:293) 
asp78F    GTCAATAACTACTCGGGCNNSAGAGTCCAAGTAGCAGGA (SEQ ID NO:294) 
asp79F    AATAACTACTCGGGCGGCNNSGTCCAAGTAGCAGGACAT (SEQ ID NO:295) 
asp80F    AACTACTCGGGCGGCAGANNSCAAGTAGCAGGACATACG (SEQ ID NO:296) 
asp81F    TACTCGGGCGGCAGAGTCNNSGTAGCAGGACATACGGCC (SEQ ID NO:297) 
asp82F    TCGGGCGGCAGAGTCCAANNSGCAGGACATACGGCCGCA (SEQ ID NO:298) 






asp83F    GGCGGCAGAGTCCAAGTANNSGGACATACGGCCGCACCA (SEQ ID NO:299) 
asp84F    GGCAGAGTCCAAGTAGCANNSCATACGGCCGCACCAGTT (SEQ ID NO:300) 
asp85F    AGAGTCCAAGTAGCAGGANNSACGGCCGCACCAGTTGGA (SEQ ID NO:301) 
asp86F    GTCCAAGTAGCAGGACATNNSGCCGCACCAGTTGGATCT (SEQ ID NO:302) 
asp87F    CAAGTAGCAGGACATACGNNSGCACCAGTTGGATCTGCT (SEQ ID NO:303) 
asp88F    GTAGCAGGACATACGGCCNNSCCAGTTGGATCTGCTGTA (SEQ ID NO:304) 
asp89F    GCAGGACATACGGCCGCANNSGTTGGATCTGCTGTATGC (SEQ ID NO:305) 
asp90F    GGACATACGGCCGCACCANNSGGATCTGCTGTATGCCGC (SEQ ID NO:306) 
asp91F    CATACGGCCGCACCAGTTNNSTCTGCTGTATGCCGCTCA (SEQ ID NO:307) 
asp92F    ACGGCCGCACCAGTTGGANNSGCTGTATGCCGCTCAGGT (SEQ ID NO:308) 
asp93F    GCCGCACCAGTTGGATCTNNSGTATGCCGCTCAGGTAGC (SEQ ID NO:309) 
asp94F    GCACCAGTTGGATCTGCTNNSTGCCGCTCAGGTAGCACT (SEQ ID NO:310) 
asp95F    CCAGTTGGATCTGCTGTANNSCGCTCAGGTAGCACTACA (SEQ ID NO:311) 
asp96F    GTTGGATCTGCTGTATGCNNSTCAGGTAGCACTACAGGT (SEQ ID NO:312) 
asp97F    GGATCTGCTGTATGCCGCNNSGGTAGCACTACAGGTTGG (SEQ ID NO:313) 
asp98F    TCTGCTGTATGCCGCTCANNSAGCACTACAGGTTGGCAT (SEQ ID NO:314) 
asp99F    GCTGTATGCCGCTCAGGTNNSACTACAGGTTGGCATTGC (SEQ ID NO:315) 
asp100F    GTATGCCGCTCAGGTAGCNNSACAGGTTGGCATTGCGGA (SEQ ID NO:316) 
asp101F    TGCCGCTCAGGTAGCACTNNSGGTTGGCATTGCGGAACT (SEQ ID NO:317) 
asp102F    CGCTCAGGTAGCACTACANNSTGGCATTGCGGAACTATC (SEQ ID NO:318) 
asp103F    TCAGGTAGCACTACAGGTNNSCATTGCGGAACTATCACG (SEQ ID NO:319) 
asp104F    GGTAGCACTACAGGTTGGNNSTGCGGAACTATCACGGCG (SEQ ID NO:320) 
asp105F    AGCACTACAGGTTGGCATNNSGGAACTATCACGGCGCTG (SEQ ID NO:321) 
asp106F    ACTACAGGTTGGCATTGCNNSACTATCACGGCGCTGAAT (SEQ ID NO:322) 
asp107F    ACAGGTTGGCATTGCGGANNSATCACGGCGCTGAATTCG (SEQ ID NO:323) 
asp108F    GGTTGGCATTGCGGAACTNNSACGGCGCTGAATTCGTCT (SEQ ID NO:324) 
asp109F    TGGCATTGCGGAACTATCNNSGCGCTGAATTCGTCTGTC (SEQ ID NO:325) 
asp110F    CATTGCGGAACTATCACGNNSCTGAATTCGTCTGTCACG (SEQ ID NO:326) 
asp111F    TGCGGAACTATCACGGCGNNSAATTCGTCTGTCACGTAT (SEQ ID NO:327) 
asp112F    GGAACTATCACGGCGCTGNNSTCGTCTGTCACGTATCCA (SEQ ID NO:328) 
asp113F    ACTATCACGGCGCTGAATNNSTCTGTCACGTATCCAGAG (SEQ ID NO:329) 
asp114F    ATCACGGCGCTGAATTCGNNSGTCACGTATCCAGAGGGA (SEQ ID NO:330) 
asp115F    ACGGCGCTGAATTCGTCTNNSACGTATCCAGAGGGAACA (SEQ ID NO:331) 
asp116F    GCGCTGAATTCGTCTGTCNNSTATCCAGAGGGAACAGTC (SEQ ID NO:332) 
asp117F    CTGAATTCGTCTGTCACGNNSCCAGAGGGAACAGTCCGA (SEQ ID NO:333) 
asp118F    AATTCGTCTGTCACGTATNNSGAGGGAACAGTCCGAGGA (SEQ ID NO:334) 
asp119F    TCGTCTGTCACGTATCCANNSGGAACAGTCCGAGGACTT (SEQ ID NO:335) 
asp120F    TCTGTCACGTATCCAGAGNNSACAGTCCGAGGACTTATC (SEQ ID NO:336) 
asp121F    GTCACGTATCCAGAGGGANNSGTCCGAGGACTTATCCGC (SEQ ID NO:337) 
asp122F    ACGTATCCAGAGGGAACANNSCGAGGACTTATCCGCACG (SEQ ID NO:338) 
asp123F    TATCCAGAGGGAACAGTCNNSGGACTTATCCGCACGACG (SEQ ID NO:339) 
asp124F    CCAGAGGGAACAGTCCGANNSCTTATCCGCACGACGGTT (SEQ ID NO:340) 
asp125F    GAGGGAACAGTCCGAGGANNSATCCGCACGACGGTTTGT (SEQ ID NO:341) 
asp126F    GGAACAGTCCGAGGACTTNNSCGCACGACGGTTTGTGCC (SEQ ID NO:342) 
asp127F    ACAGTCCGAGGACTTATCNNSACGACGGTTTGTGCCGAA (SEQ ID NO:343) 
asp128F    GTCCGAGGACTTATCCGCNNSACGGTTTGTGCCGAACCA (SEQ ID NO:344) 
asp129F    CGAGGACTTATCCGCACGNNSGTTTGTGCCGAACCAGGT (SEQ ID NO:345) 
asp130F    GGACTTATCCGCACGACGNNSTGTGCCGAACCAGGTGAT (SEQ ID NO:346) 
asp131F    CTTATCCGCACGACGGTTNNSGCCGAACCAGGTGATAGC (SEQ ID NO:347) 
asp132F    ATCCGCACGACGGTTTGTNNSGAACCAGGTGATAGCGGA (SEQ ID NO:348) 
asp133F    CGCACGACGGTTTGTGCCNNSCCAGGTGATAGCGGAGGT (SEQ ID NO:349) 
asp134F    ACGACGGTTTGTGCCGAANNSGGTGATAGCGGAGGTAGC (SEQ ID NO:350) 
asp135F    ACGGTTTGTGCCGAACCANNSGATAGCGGAGGTAGCCTT (SEQ ID NO:351) 
asp136F    GTTTGTGCCGAACCAGGTNNSAGCGGAGGTAGCCTTTTA (SEQ ID NO:352) 
asp137F    TGTGCCGAACCAGGTGATNNSGGAGGTAGCCTTTTAGCG (SEQ ID NO:353) 
asp138F    GCCGAACCAGGTGATAGCNNSGGTAGCCTTTTAGCGGGA (SEQ ID NO:354) 
asp139F    GAACCAGGTGATAGCGGANNSAGCCTTTTAGCGGGAAAT (SEQ ID NO:355) 
asp140F    CCAGGTGATAGCGGAGGTNNSCTTTTAGCGGGAAATCAA (SEQ ID NO:356) 
asp141F    GGTGATAGCGGAGGTAGCNNSTTAGCGGGAAATCAAGCC (SEQ ID NO:357) 
asp142F    GATAGCGGAGGTAGCCTTNNSGCGGGAAATCAAGCCCAA (SEQ ID NO:358) 
asp143F    AGCGGAGGTAGCCTTTTANNSGGAAATCAAGCCCAAGGT (SEQ ID NO:359) 
asp144F    GGAGGTAGCCTTTTAGCGNNSAATCAAGCCCAAGGTGTC (SEQ ID NO:360) 
asp145F    GGTAGCCTTTTAGCGGGANNSCAAGCCCAAGGTGTCACG (SEQ ID NO:361) 
asp146F    AGCCTTTTAGCGGGAAATNNSGCCCAAGGTGTCACGTCA (SEQ ID NO:362) 
asp147F    CTTTTAGCGGGAAATCAANNSCAAGGTGTCACGTCAGGT (SEQ ID NO:363) 
asp148F    TTAGCGGGAAATCAAGCCNNSGGTGTCACGTCAGGTGGT (SEQ ID NO:364) 
asp149F    GCGGGAAATCAAGCCCAANNSGTCACGTCAGGTGGTTCT (SEQ ID NO:365) 
asp150F    GGAAATCAAGCCCAAGGTNNSACGTCAGGTGGTTCTGGA (SEQ ID NO:366) 
asp151F    AATCAAGCCCAAGGTGTCNNSTCAGGTGGTTCTGGAAAT (SEQ ID NO:367) 
asp152F    CAAGCCCAAGGTGTCACGNNSGGTGGTTCTGGAAATTGT (SEQ ID NO:368) 
asp153F    GCCCAAGGTGTCACGTCANNSGGTTCTGGAAATTGTCGG (SEQ ID NO:369) 
asp154F    CAAGGTGTCACGTCAGGTNNSTCTGGAAATTGTCGGACG (SEQ ID NO:370) 
asp155F    GGTGTCACGTCAGGTGGTNNSGGAAATTGTCGGACGGGG (SEQ ID NO:371) 
asp156F    GTCACGTCAGGTGGTTCTNNSAATTGTCGGACGGGGGGA (SEQ ID NO:372) 
asp157F    ACGTCAGGTGGTTCTGGANNSTGTCGGACGGGGGGAACA (SEQ ID NO:373) 
asp158F    TCAGGTGGTTCTGGAAATNNSCGGACGGGGGGAACAACA (SEQ ID NO:374) 
asp159F    GGTGGTTCTGGAAATTGTNNSACGGGGGGAACAACATTC (SEQ ID NO:375) 
asp160F    GGTTCTGGAAATTGTCGGNNSGGGGGAACAACATTCTTT (SEQ ID NO:376) 
asp161F    TCTGGAAATTGTCGGACGNNSGGAACAACATTCTTTCAA (SEQ ID NO:377) 
asp162F    GGAAATTGTCGGACGGGGNNSACAACATTCTTTCAACCA (SEQ ID NO:378) 
asp163F    AATTGTCGGACGGGGGGANNSACATTCTTTCAACCAGTC (SEQ ID NO:379) 
asp164F    TGTCGGACGGGGGGAACANNSTTCTTTCAACCAGTCAAC (SEQ ID NO:380) 
asp165F    CGGACGGGGGGAACAACANNSTTTCAACCAGTCAACCCG (SEQ ID NO:381) 
asp166F    ACGGGGGGAACAACATTCNNSCAACCAGTCAACCCGATT (SEQ ID NO:382) 
asp167F    GGGGGAACAACATTCTTTNNSCCAGTCAACCCGATTTTG (SEQ ID NO:383) 
asp168F    GGAACAACATTCTTTCAANNSGTCAACCCGATTTTGCAG (SEQ ID NO:384) 
asp169F    ACAACATTCTTTCAACCANNSAACCCGATTTTGCAGGCT (SEQ ID NO:385) 
asp170F    ACATTCTTTCAACCAGTCNNSCCGATTTTGCAGGCTTAC (SEQ ID NO:386) 
asp171F    TTCTTTCAACCAGTCAACNNSATTTTGCAGGCTTACGGC (SEQ ID NO:387) 
asp172F    TTTCAACCAGTCAACCCGNNSTTGCAGGCTTACGGCCTG (SEQ ID NO:388) 
asp173F    CAACCAGTCAACCCGATTNNSCAGGCTTACGGCCTGAGA (SEQ ID NO:389) 
asp174F    CCAGTCAACCCGATTTTGNNSGCTTACGGCCTGAGAATG (SEQ ID NO:390) 
asp175F    GTCAACCCGATTTTGCAGNNSTACGGCCTGAGAATGATT (SEQ ID NO:391) 
asp176F    AACCCGATTTTGCAGGCTNNSGGCCTGAGAATGATTACG (SEQ ID NO:392) 
asp177F    CCGATTTTGCAGGCTTACNNSCTGAGAATGATTACGACT (SEQ ID NO:393) 
asp178F    ATTTTGCAGGCTTACGGCNNSAGAATGATTACGACTGAC (SEQ ID NO:394) 
asp179F    TTGCAGGCTTACGGCCTGNNSATGATTACGACTGACTCT (SEQ ID NO:395) 
asp180F    CAGGCTTACGGCCTGAGANNSATTACGACTGACTCTGGA (SEQ ID NO:396) 
asp181F    GCTTACGGCCTGAGAATGNNSACGACTGACTCTGGAAGT (SEQ ID NO:397) 
asp182F    TACGGCCTGAGAATGATTNNSACTGACTCTGGAAGTTCC (SEQ ID NO:398) 
asp183F    GGCCTGAGAATGATTACGNNSGACTCTGGAAGTTCCCCT (SEQ ID NO:399) 
asp184F    CTGAGAATGATTACGACTNNSTCTGGAAGTTCCCCTTAA (SEQ ID NO:400) 
asp185F    AGAATGATTACGACTGACNNSGGAAGTTCCCCTTAACCC (SEQ ID NO:401) 
asp186F    ATGATTACGACTGACTCTNNSAGTTCCCCTTAACCCAAC (SEQ ID NO:402) 
asp187F    ATTACGACTGACTCTGGANNSTCCCCTTAACCCAACAGA (SEQ ID NO:403) 
asp188F    ACGACTGACTCTGGAAGTNNSCCTTAACCCAACAGAGGA (SEQ ID NO:404) 
asp189F    ACTGACTCTGGAAGTTCCNNSTAACCCAACAGAGGACGG (SEQ ID NO:405) 

Reverse Mutagenesis Primer    DNA sequence, 5' to 3' 

asp1R    GTTGCCTCCAATTACGTCSNNCATCGTTCTAGGCGTTTC (SEQ ID NO:406) 
asp2R    TGCGTTGCCTCCAATTACSNNGAACATCGTTCTAGGCGT (SEQ ID NO:407) 
asp3R    ATATGCGTTGCCTCCAATSNNGTCGAACATCGTTCTAGG (SEQ ID NO:408) 
asp4R    AGTATATGCGTTGCCTCCSNNTACGTCGAACATCGTTCT (SEQ ID NO:409) 
asp5R    AATAGTATATGCGTTGCCSNNAATTACGTCGAACATCGT (SEQ ID NO:410) 
asp6R    GCCAATAGTATATGCGTTSNNTCCAATTACGTCGAACAT (SEQ ID NO:411) 
asp7R    GCCGCCAATAGTATATGCSNNGCCTCCAATTACGTCGAA (SEQ ID NO:412) 
asp8R    CCGGCCGCCAATAGTATASNNGTTGCCTCCAATTACGTC (SEQ ID NO:413) 
asp9R    AGACCGGCCGCCAATAGTSNNTGCGTTGCCTCCAATTAC (SEQ ID NO:414) 
asp10R    TCTAGACCGGCCGCCAATSNNATATGCGTTGCCTCCAAT (SEQ ID NO:415) 
asp11R    ACATCTAGACCGGCCGCCSNNAGTATATGCGTTGCCTCC (SEQ ID NO:416) 
asp12R    AGAACATCTAGACCGGCCSNNAATAGTATATGCGTTGCC (SEQ ID NO:417) 
asp13R    GATAGAACATCTAGACCGSNNGCCAATAGTATATGCGTT (SEQ ID NO:418) 
asp14R    TCCGATAGAACATCTAGASNNGCCGCCAATAGTATATGC (SEQ ID NO:419) 
asp15R    GAATCCGATAGAACATCTSNNCCGGCCGCCAATAGTATA (SEQ ID NO:420) 
asp16R    TGCGAATCCGATAGAACASNNAGACCGGCCGCCAATAGT (SEQ ID NO:421) 
asp17R    TACTGCGAATCCGATAGASNNTCTAGACCGGCCGCCAAT (SEQ ID NO:422) 
asp18R    GTTTACTGCGAATCCGATSNNACATCTAGACCGGCCGCC (SEQ ID NO:423) 
asp19R    ACCGTTTACTGCGAATCCSNNAGAACATCTAGACCGGCC (SEQ ID NO:424) 

asp20R    GCCACCGTTTACTGCGAASNNGATAGAACATCTAGACCG (SEQ ID NO:425) 
asp21R    GAAGCCACCGTTTACTGCSNNTCCGATAGAACATCTAGA (SEQ ID NO:426) 
asp22R    AATGAAGCCACCGTTTACSNNGAATCCGATAGAACATCT (SEQ ID NO:427) 
asp23R    AGTAATGAAGCCACCGTTSNNTGCGAATCCGATAGAACA (SEQ ID NO:428) 
asp24R    GGCAGTAATGAAGCCACCSNNTACTGCGAATCCGATAGA (SEQ ID NO:429) 
asp25R    ACCGGCAGTAATGAAGCCSNNGTTTACTGCGAATCCGAT (SEQ ID NO:430) 
asp26R    GTGACCGGCAGTAATGAASNNACCGTTTACTGCGAATCC (SEQ ID NO:431) 
asp27R    GCAGTGACCGGCAGTAATSNNGCCACCGTTTACTGCGAA (SEQ ID NO:432) 
asp28R    TCCGCAGTGACCGGCAGTSNNGAAGCCACCGTTTACTGC (SEQ ID NO:433) 
asp29R    TCTTCCGCAGTGACCGGCSNNAATGAAGCCACCGTTTAC (SEQ ID NO:434) 
asp30R    TGTTCTTCCGCAGTGACCSNNAGTAATGAAGCCACCGTT (SEQ ID NO:435) 
asp31R    TCCTGTTCTTCCGCAGTGSNNGGCAGTAATGAAGCCACC (SEQ ID NO:436) 
asp32R    GGCTCCTGTTCTTCCGCASNNACCGGCAGTAATGAAGCC (SEQ ID NO:437) 
asp33R    AGTGGCTCCTGTTCTTCCSNNGTGACCGGCAGTAATGAA (SEQ ID NO:438) 
asp34R    AGTAGTGGCTCCTGTTCTSNNGCAGTGACCGGCAGTAAT (SEQ ID NO:439) 
asp35R    GGCAGTAGTGGCTCCTGTSNNTCCGCAGTGACCGGCAGT (SEQ ID NO:440) 
asp36R    ATTGGCAGTAGTGGCTCCSNNTCTTCCGCAGTGACCGGC (SEQ ID NO:441) 
asp37R    CGGATTGGCAGTAGTGGCSNNTGTTCTTCCGCAGTGACC (SEQ ID NO:442) 
asp38R    AGTCGGATTGGCAGTAGTSNNTCCTGTTCTTCCGCAGTG (SEQ ID NO:443) 
asp39R    GCCAGTCGGATTGGCAGTSNNGGCTCCTGTTCTTCCGCA (SEQ ID NO:444) 
asp40R    TGTGCCAGTCGGATTGGCSNNAGTGGCTCCTGTTCTTCC (SEQ ID NO:445) 
asp41R    AAATGTGCCAGTCGGATTSNNAGTAGTGGCTCCTGTTCT (SEQ ID NO:446) 
asp42R    TGCAAATGTGCCAGTCGGSNNGGCAGTAGTGGCTCCTGT (SEQ ID NO:447) 
asp43R    ACCTGCAAATGTGCCAGTSNNATTGGCAGTAGTGGCTCC (SEQ ID NO:448) 
asp44R    GCTACCTGCAAATGTGCCSNNCGGATTGGCAGTAGTGGC (SEQ ID NO:449) 
asp45R    CGAGCTACCTGCAAATGTSNNAGTCGGATTGGCAGTAGT (SEQ ID NO:450) 




asp46R    AAACGAGCTACCTGCAAASNNGCCAGTCGGATTGGCAGT (SEQ ID NO:451) 
asp47R    CGGAAACGAGCTACCTGCSNNTGTGCCAGTCGGATTGGC (SEQ ID NO:452) 
asp48R    TCCCGGAAACGAGCTACCSNNAAATGTGCCAGTCGGATT (SEQ ID NO:453) 
asp49R    ATTTCCCGGAAACGAGCTSNNTGCAAATGTGCCAGTCGG (SEQ ID NO:454) 
asp50R    ATCATTTCCCGGAAACGASNNACCTGCAAATGTGCCAGT (SEQ ID NO:455) 
asp51R    ATAATCATTTCCCGGAAASNNGCTACCTGCAAATGTGCC (SEQ ID NO:456) 
asp52R    TGCATAATCATTTCCCGGSNNCGAGCTACCTGCAAATGT (SEQ ID NO:457) 
asp53R    GAATGCATAATCATTTCCSNNAAACGAGCTACCTGCAAA (SEQ ID NO:458) 
asp54R    GACGAATGCATAATCATTSNNCGGAAACGAGCTACCTGC (SEQ ID NO:459) 
asp55R    TCGGACGAATGCATAATCSNNTCCCGGAAACGAGCTACC (SEQ ID NO:460) 
asp56R    TGTTCGGACGAATGCATASNNATTTCCCGGAAACGAGCT (SEQ ID NO:461) 
asp57R    CCCTGTTCGGACGAATGCSNNATCATTTCCCGGAAACGA (SEQ ID NO:462) 
asp58R    TGCCCCTGTTCGGACGAASNNATAATCATTTCCCGGAAA (SEQ ID NO:463) 
asp59R    TCCTGCCCCTGTTCGGACSNNTGCATAATCATTTCCCGG (SEQ ID NO:464) 
asp60R    TACTCCTGCCCCTGTTCGSNNGAATGCATAATCATTTCC (SEQ ID NO:465) 
asp61R    ATTTACTCCTGCCCCTGTSNNGACGAATGCATAATCATT (SEQ ID NO:466) 
asp62R    CAAATTTACTCCTGCCCCSNNTCGGACGAATGCATAATC (SEQ ID NO:467) 
asp63R    AAGCAAATTTACTCCTGCSNNTGTTCGGACGAATGCATA (SEQ ID NO:468) 
asp64R    GGCAAGCAAATTTACTCCSNNCCCTGTTCGGACGAATGC (SEQ ID NO:469) 
asp65R    TTGGGCAAGCAAATTTACSNNTGCCCCTGTTCGGACGAA (SEQ ID NO:470) 
asp66R    GACTTGGGCAAGCAAATTSNNTCCTGCCCCTGTTCGGAC (SEQ ID NO:471) 
asp67R    ATTGACTTGGGCAAGCAASNNTACTCCTGCCCCTGTTCG (SEQ ID NO:472) 
asp68R    GTTATTGACTTGGGCAAGSNNATTTACTCCTGCCCCTGT (SEQ ID NO:473) 
asp69R    GTAGTTATTGACTTGGGCSNNCAAATTTACTCCTGCCCC (SEQ ID NO:474) 
asp70R    CGAGTAGTTATTGACTTGSNNAAGCAAATTTACTCCTGC (SEQ ID NO:475) 
asp71R    GCCCGAGTAGTTATTGACSNNGGCAAGCAAATTTACTCC (SEQ ID NO:476) 
asp72R    GCCGCCCGAGTAGTTATTSNNTTGGGCAAGCAAATTTAC (SEQ ID NO:477) 
asp73R    TCTGCCGCCCGAGTAGTTSNNGACTTGGGCAAGCAAATT (SEQ ID NO:478) 
asp74R    GACTCTGCCGCCCGAGTASNNATTGACTTGGGCAAGCAA (SEQ ID NO:479) 
asp75R    TTGGACTCTGCCGCCCGASNNGTTATTGACTTGGGCAAG (SEQ ID NO:480) 
asp76R    TACTTGGACTCTGCCGCCSNNGTAGTTATTGACTTGGGC (SEQ ID NO:481) 
asp77R    TGCTACTTGGACTCTGCCSNNCGAGTAGTTATTGACTTG (SEQ ID NO:482) 
asp78R    TCCTGCTACTTGGACTCTSNNGCCCGAGTAGTTATTGAC (SEQ ID NO:483) 
asp79R    ATGTCCTGCTACTTGGACSNNGCCGCCCGAGTAGTTATT (SEQ ID NO:484) 
asp80R    CGTATGTCCTGCTACTTGSNNTCTGCCGCCCGAGTAGTT (SEQ ID NO:485) 
asp81R    GGCCGTATGTCCTGCTACSNNGACTCTGCCGCCCGAGTA (SEQ ID NO:486) 
asp82R    TGCGGCCGTATGTCCTGCSNNTTGGACTCTGCCGCCCGA (SEQ ID NO:487) 




asp83R    TGGTGCGGCCGTATGTCCSNNTACTTGGACTCTGCCGCC (SEQ ID NO:488) 
asp84R    AACTGGTGCGGCCGTATGSNNTGCTACTTGGACTCTGCC (SEQ ID NO:489) 
asp85R    TCCAACTGGTGCGGCCGTSNNTCCTGCTACTTGGACTCT (SEQ ID NO:490) 
asp86R    AGATCCAACTGGTGCGGCSNNATGTCCTGCTACTTGGAC (SEQ ID NO:491) 
asp87R    AGCAGATCCAACTGGTGCSNNCGTATGTCCTGCTACTTG (SEQ ID NO:492) 
asp88R    TACAGCAGATCCAACTGGSNNGGCCGTATGTCCTGCTAC (SEQ ID NO:493) 
asp89R    GCATACAGCAGATCCAACSNNTGCGGCCGTATGTCCTGC (SEQ ID NO:494) 
asp90R    GCGGCATACAGCAGATCCSNNTGGTGCGGCCGTATGTCC (SEQ ID NO:495) 
asp91R    TGAGCGGCATACAGCAGASNNAACTGGTGCGGCCGTATG (SEQ ID NO:496) 
asp92R    ACCTGAGCGGCATACAGCSNNTCCAACTGGTGCGGCCGT (SEQ ID NO:497) 
asp93R    GCTACCTGAGCGGCATACSNNAGATCCAACTGGTGCGGC (SEQ ID NO:498) 
asp94R    AGTGCTACCTGAGCGGCASNNAGCAGATCCAACTGGTGC (SEQ ID NO:499) 
asp95R    TGTAGTGCTACCTGAGCGSNNTACAGCAGATCCAACTGG (SEQ ID NO:500) 
asp96R    ACCTGTAGTGCTACCTGASNNGCATACAGCAGATCCAAC (SEQ ID NO:501) 
asp97R    CCAACCTGTAGTGCTACCSNNGCGGCATACAGCAGATCC (SEQ ID NO:502) 
asp98R    ATGCCAACCTGTAGTGCTSNNTGAGCGGCATACAGCAGA (SEQ ID NO:503) 
asp99R    GCAATGCCAACCTGTAGTSNNACCTGAGCGGCATACAGC (SEQ ID NO:504) 
asp100R    TCCGCAATGCCAACCTGTSNNGCTACCTGAGCGGCATAC (SEQ ID NO:505) 
asp101R    AGTTCCGCAATGCCAACCSNNAGTGCTACCTGAGCGGCA (SEQ ID NO:506) 
asp102R    GATAGTTCCGCAATGCCASNNTGTAGTGCTACCTGAGCG (SEQ ID NO:507) 
asp103R    CGTGATAGTTCCGCAATGSNNACCTGTAGTGCTACCTGA (SEQ ID NO:508) 
asp104R    CGCCGTGATAGTTCCGCASNNCCAACCTGTAGTGCTACC (SEQ ID NO:509) 
asp105R    CAGCGCCGTGATAGTTCCSNNATGCCAACCTGTAGTGCT (SEQ ID NO:510) 
asp106R    ATTCAGCGCCGTGATAGTSNNGCAATGCCAACCTGTAGT (SEQ ID NO:511) 
asp107R    CGAATTCAGCGCCGTGATSNNTCCGCAATGCCAACCTGT (SEQ ID NO:512) 
asp108R    AGACGAATTCAGCGCCGTSNNAGTTCCGCAATGCCAACC (SEQ ID NO:513) 
asp109R    GACAGACGAATTCAGCGCSNNGATAGTTCCGCAATGCCA (SEQ ID NO:514) 
asp110R    CGTGACAGACGAATTCAGSNNCGTGATAGTTCCGCAATG (SEQ ID NO:515) 
asp111R    ATACGTGACAGACGAATTSNNCGCCGTGATAGTTCCGCA (SEQ ID NO:516) 
asp112R    TGGATACGTGACAGACGASNNCAGCGCCGTGATAGTTCC (SEQ ID NO:517) 
asp113R    CTCTGGATACGTGACAGASNNATTCAGCGCCGTGATAGT (SEQ ID NO:518) 
asp114R    TCCCTCTGGATACGTGACSNNCGAATTCAGCGCCGTGAT (SEQ ID NO:519) 
asp115R    TGTTCCCTCTGGATACGTSNNAGACGAATTCAGCGCCGT (SEQ ID NO:520) 
asp116R    GACTGTTCCCTCTGGATASNNGACAGACGAATTCAGCGC (SEQ ID NO:521) 
asp117R    TCGGACTGTTCCCTCTGGSNNCGTGACAGACGAATTCAG (SEQ ID NO:522) 
asp118R    TCCTCGGACTGTTCCCTCSNNATACGTGACAGACGAATT (SEQ ID NO:523) 
asp119R    AAGTCCTCGGACTGTTCCSNNTGGATACGTGACAGACGA (SEQ ID NO:524) 
asp120R    GATAAGTCCTCGGACTGTSNNCTCTGGATACGTGACAGA (SEQ ID NO:525) 
asp121R    GCGGATAAGTCCTCGGACSNNTCCCTCTGGATACGTGAC (SEQ ID NO:526) 
asp122R    CGTGCGGATAAGTCCTCGSNNTGTTCCCTCTGGATACGT (SEQ ID NO:527) 
asp123R    CGTCGTGCGGATAAGTCCSNNGACTGTTCCCTCTGGATA (SEQ ID NO:528) 




AACCGTCGTGCGGATAAGSNNTCGGACTGTTCCCTCTGG 


asp124R 


(SEQ ID NO:529) 




ACAAACCGTCGTGCGGATSNNTCCTCGGACTGTTCCCTC 


asp125R


(SEQ ID NO:530) 




GGCACAAACCGTCGTGCGSNNAAGTCCTCGGACTGTTCC


asp126R


(SEQ ID NO:531) 


TTCGGCACAAACCGTCGTSNNGATAAGTCCTCGGACTGT 

asp127R


(SEQ ID NO:532) 




TGGTTCGGCACAAACCGTSNNGCGGATAAGTCCTCGGAC 


asp128R


(SEQ ID NO:533) 




ACCTGGTTCGGCACAAACSNNCGTGCGGATAAGTCCTCG 

asp129R


(SEQ ID NO:534) 




ATCACCTGGTTCGGCACASNNCGTCGTGCGGATAAGTCC 


asp130R


(SEQ ID NO:535) 




GCTATCACCTGGTTCGGCSNNAACCGTCGTGCGGATAAG 

asp131R


(SEQ ID NO:536) 




TCCGCTATCACCTGGTTCSNNACAAACCGTCGTGCGGAT 


asp132R


(SEQ ID NO:537) 




ACCTCCGCTATCACCTGGSNNGGCACAAACCGTCGTGCG 

asp133R


(SEQ ID NO:538) 




GCTACCTCCGCTATCACCSNNTTCGGCACAAACCGTCGT 

asp134R


(SEQ ID NO:539) 




AAGGCTACCTCCGCTATCSNNTGGTTCGGCACAAACCGT 

asp135R


(SEQ ID NO:540) 




TAAAAGGCTACCTCCGCTSNNACCTGGTTCGGCACAAAC 


asp136R 


(SEQ ID NO:541) 




CGCTAAAAGGCTACCTCCSNNATCACCTGGTTCGGCACA 


asp137R 


(SEQ ID NO:542) 




TCCCGCTAAAAGGCTACCSNNGCTATCACCTGGTTCGGC 


asp138R 


(SEQ ID NO:543) 




ATTTCCCGCTAAAAGGCTSNNTCCGCTATCACCTGGTTC 



asp139R 


(SEQ ID NO:544) 




TTGATTTCCCGCTAAAAGSNNACCTCCGCTATCACCTGG 



asp140R 


(SEQ ID NO:545) 


GGCTTGATTTCCCGCTAASNNGCTACCTCCGCTATCACC 


asp141R 


(SEQ ID NO:546) 


TTGGGCTTGATTTCCCGCSNNAAGGCTACCTCCGCTATC 


asp142R 


(SEQ ID NO:547) 


ACCTTGGGCTTGATTTCCSNNTAAAAGGCTACCTCCGCT 


asp143R 


(SEQ ID NO:548) 


GACACCTTGGGCTTGATTSNNCGCTAAAAGGCTACCTCC 


asp144R 


(SEQ ID NO:549) 


CGTGACACCTTGGGCTTGSNNTCCCGCTAAAAGGCTACC 


asp145R 


(SEQ ID NO:550) 


TGACGTGACACCTTGGGCSNNATTTCCCGCTAAAAGGCT 


asp146R 


(SEQ ID NO:551)


ACCTGACGTGACACCTTGSNNTTGATTTCCCGCTAAAAG


asp147R 


(SEQ ID NO:552) 




ACCACCTGACGTGACACCSNNGGCTTGATTTCCCGCTAA 


asp148R 


(SEQ ID NO:553) 




AGAACCACCTGACGTGACSNNTTGGGCTTGATTTCCCGC 


asp149R 


(SEQ ID NO:554) 




TCCAGAACCACCTGACGTSNNACCTTGGGCTTGATTTCC 


asp150R


(SEQ ID NO:555) 




ATTTCCAGAACCACCTGASNNGACACCTTGGGCTTGATT 


asp151R 


(SEQ ID NO:556) 




ACAATTTCCAGAACCACCSNNCGTGACACCTTGGGCTTG 


asp152R 


(SEQ ID NO:557) 




CCGACAATTTCCAGAACCSNNTGACGTGACACCTTGGGC 


asp153R 


(SEQ ID NO:558) 




CGTCCGACAATTTCCAGASNNACCTGACGTGACACCTTG 


asp154R


(SEQ ID NO:559) 




CCCCGTCCGACAATTTCCSNNACCACCTGACGTGACACC 


asp155R 


(SEQ ID NO:560) 




TCCCCCCGTCCGACAATTSNNAGAACCACCTGACGTGAC 


asp156R 


(SEQ ID NO:561) 




TGTTCCCCCCGTCCGACASNNTCCAGAACCACCTGACGT 


asp157R 


(SEQ ID NO:562) 




TGTTGTTCCCCCCGTCCGSNNATTTCCAGAACCACCTGA 


asp158R 


(SEQ ID NO:563) 




GAATGTTGTTCCCCCCGTSNNACAATTTCCAGAACCACC 


asp159R 


(SEQ ID NO:564) 




AAAGAATGTTGTTCCCCCSNNCCGACAATTTCCAGAACC


asp160R


(SEQ ID NO:565) 




TTGAAAGAATGTTGTTCCSNNCGTCCGACAATTTCCAGA 


asp161R


(SEQ ID NO:566) 




TGGTTGAAAGAATGTTGTSNNCCCCGTCCGACAATTTCC 


asp162R


(SEQ ID NO:567) 




GACTGGTTGAAAGAATGTSNNTCCCCCCGTCCGACAATT 


asp163R 


(SEQ ID NO:568) 




GTTGACTGGTTGAAAGAASNNTGTTCCCCCCGTCCGACA 


asp164R


(SEQ ID NO:569) 




CGGGTTGACTGGTTGAAASNNTGTTGTTCCCCCCGTCCG 


asp165R 


(SEQ ID NO:570) 




AATCGGGTTGACTGGTTGSNNGAATGTTGTTCCCCCCGT 


asp166R 


(SEQ ID NO:571) 




CAAAATCGGGTTGACTGGSNNAAAGAATGTTGTTCCCCC 


asp167R 


(SEQ ID NO:572) 


CTGCAAAATCGGGTTGACSNNTTGAAAGAATGTTGTTCC


asp168R


(SEQ ID NO:573)




AGCCTGCAAAATCGGGTTSNNTGGTTGAAAGAATGTTGT


asp169R


(SEQ ID NO:574)


GTAAGCCTGCAAAATCGGSNNGACTGGTTGAAAGAATGT 


asp170R 


(SEQ ID NO:575) 


GCCGTAAGCCTGCAAAATSNNGTTGACTGGTTGAAAGAA


asp171R 


(SEQ ID NO:576) 


CAGGCCGTAAGCCTGCAASNNCGGGTTGACTGGTTGAAA
asp172R
(SEQ ID NO:577)

TCTCAGGCCGTAAGCCTGSNNAATCGGGTTGACTGGTTG
asp173R
(SEQ ID NO:578)

CATTCTCAGGCCGTAAGCSNNCAAAATCGGGTTGACTGG
asp174R
(SEQ ID NO:579)

AATCATTCTCAGGCCGTASNNCTGCAAAATCGGGTTGAC
asp175R
(SEQ ID NO:580)

CGTAATCATTCTCAGGCCSNNAGCCTGCAAAATCGGGTT
asp176R
(SEQ ID NO:581)

AGTCGTAATCATTCTCAGSNNGTAAGCCTGCAAAATCGG
asp177R
(SEQ ID NO:582)

GTCAGTCGTAATCATTCTSNNGCCGTAAGCCTGCAAAAT
asp178R
(SEQ ID NO:583)

AGAGTCAGTCGTAATCATSNNCAGGCCGTAAGCCTGCAA
asp179R
(SEQ ID NO:584)

TCCAGAGTCAGTCGTAATSNNTCTCAGGCCGTAAGCCTG
asp180R
(SEQ ID NO:585)

ACTTCCAGAGTCAGTCGTSNNCATTCTCAGGCCGTAAGC
asp181R
(SEQ ID NO:586)

GGAACTTCCAGAGTCAGTSNNAATCATTCTCAGGCCGTA
asp182R
(SEQ ID NO:587)

AGGGGAACTTCCAGAGTCSNNCGTAATCATTCTCAGGCC
asp183R
(SEQ ID NO:588)

TTAAGGGGAACTTCCAGASNNAGTCGTAATCATTCTCAG
asp184R
(SEQ ID NO:589)

GGGTTAAGGGGAACTTCCSNNGTCAGTCGTAATCATTCT
asp185R
(SEQ ID NO:590)


GTTGGGTTAAGGGGAACTSNNAGAGTCAGTCGTAATCAT 


asp186R 


(SEQ ID NO:591) 


TCTGTTGGGTTAAGGGGASNNTCCAGAGTCAGTCGTAAT 


asp187R


(SEQ ID NO:592) 


TCCTCTGTTGGGTTAAGGSNNACTTCCAGAGTCAGTCGT 


asp188R 


(SEQ ID NO:593) 


CCGTCCTCTGTTGGGTTASNNGGAACTTCCAGAGTCAGT 


asp189R 


(SEQ ID NO:594) 
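
The reverse primers listed above follow a regular pattern: 18 nucleotides of flanking sequence, the degenerate triplet SNN, and 18 more nucleotides, with each primer shifted by one codon relative to the previous one. The following Python sketch illustrates that pattern. It assumes the standard IUPAC codes (N = any base, S = G or C), so that SNN on the reverse strand corresponds to an NNS codon on the coding strand. The function names and the short reverse-strand segment are illustrative only; the segment is taken from the asp171R and asp172R entries.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code; codons ordered T, C, A, G at each position.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

# S (G or C) and N (any base) are their own complements.
COMPLEMENT = str.maketrans("ACGTSN", "TGCASN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, S, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def nns_codons():
    """Enumerate the 32 coding-strand codons represented by NNS."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GC")]


def reverse_primer(reverse_strand, codon_index, flank=18):
    """Replace one codon of the reverse strand with SNN, keeping the flanks."""
    start = 3 * codon_index
    if start < flank or start + 3 + flank > len(reverse_strand):
        raise ValueError("not enough flanking sequence")
    left = reverse_strand[start - flank:start]
    right = reverse_strand[start + 3:start + 3 + flank]
    return left + "SNN" + right


# Reverse-strand segment spanning the asp171R and asp172R primers.
REV_SEGMENT = "CAGGCCGTAAGCCTGCAAAATCGGGTTGACTGGTTGAAAGAA"

if __name__ == "__main__":
    codons = nns_codons()
    encoded = {CODON_TABLE[c] for c in codons}
    print(len(codons), "codons;", len(encoded - {"*"}), "amino acids")
    print(reverse_primer(REV_SEGMENT, 6))  # asp172R pattern
```

Under these assumptions the NNS set contains a single stop codon (TAG), which is one common reason for preferring it to a fully random NNN triplet.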




EXAMPLE 16 



Construction of Arginine and Cysteine Combinatorial Mutants 

In this Example, the construction of multiple arginine and cysteine mutants of ASP is
described. These experiments were conducted in order to determine whether the use of
surface arginine and cysteine combinatorial libraries would lead to mutants with increased
expression at the protein level.

The QuikChange® multi site-directed mutagenesis (QCMS) kit (Stratagene) was
used to construct the two libraries. The 5' phosphorylated primers used to create the two
libraries are shown in Table 16-1. It was noted that primers purified by HPLC, PAGE or any
other method gave far better results in terms of incorporation of full-length primers, as well
as a significant reduction in errors originating from the primers. However, in these
experiments, purified primers were not used, which probably explains why about 12% of the
clones had undesired mutations.



Table 16-1. Primers and Sequences

Primer name   Primer sequence

ASPR14L       gcatatactattggcggcctgtctagatgttctatcgga (SEQ ID NO:595)
ASPR16Q       actattggcggccggtctcagtgttctatcggattcgc (SEQ ID NO:596)
ASPR35F       ctgccggtcactgcggatttacaggagccactactgc (SEQ ID NO:597)
ASPR61S       atgattatgcattcgtctcaacaggggcaggagtaaat (SEQ ID NO:598)
ASPR79T       ataactactcgggcggcacagtccaagtagcaggacatac (SEQ ID NO:599)
ASPR123L      atccagagggaacagtcctgggacttatccgcacgac (SEQ ID NO:600)
ASPR127Q      cagtccgaggacttatccagacgacggtttgtgccgaac (SEQ ID NO:601)
ASPR159Q      gtggttctggaaattgtcagacggggggaacaacattc (SEQ ID NO:602)
ASPR179Q      tgcaggcttacggcctgcagatgattacgactgactc (SEQ ID NO:603)

ASPC17S       ttggcggccggtctagatcatctatcggattcgcagta (SEQ ID NO:604)
ASPC33S       tcattactgccggtcactcaggaagaacaggagccact (SEQ ID NO:605)
ASPC95S       cagttggatctgctgtatctcgctcaggtagcactac (SEQ ID NO:606)
ASPC105S      cactacaggttggcattcaggaactatcacggcgctg (SEQ ID NO:607)
ASPC131S      cttatccgcacgacggtttcagccgaaccaggtgatag (SEQ ID NO:608)
ASPC158S      caggtggttctggaaattcacggacggggggaacaac (SEQ ID NO:609)

ASPSEQF1      tgcctcacatttgtgccac (SEQ ID NO:610)
ASPSEQF4      caggatgtagctgcaggac (SEQ ID NO:611)
ASPSEQR4      ctcggttatgagttagttc (SEQ ID NO:612)



pHPLT-ASP-C1-2 Plasmid Preparation and In vitro Methylation 

To construct the cysteine and arginine libraries using the QCMS kit, the template
plasmid pHPLT-ASP-C1-2 was first methylated in vitro, since it was derived from a Bacillus
strain that does not methylate DNA at GATC sites. The more common approach to ensuring
methylation of plasmids used in the QCMS protocol, namely deriving the DNA from a dam+
E. coli strain, was not an option here because the plasmid pHPLT-ASP-C1-2 does not grow
in E. coli.

Miniprep DNA was prepared from Bacillus cells harboring the pHPLT-ASP-C1-2
plasmid. Specifically, the strain was grown overnight in 5 mL of LB with 10 ppm of neomycin,
after which the cells were spun down. The Qiagen spin miniprep DNA kit was used for
preparing the plasmid DNA with an additional step wherein 100 µL of 10 mg/mL lysozyme
was added after the addition of 250 µL of P1 buffer from the kit. The sample was incubated
at 37°C for 15 min with shaking, after which the remaining steps outlined in the Qiagen
miniprep kit manual were carried out. The miniprep DNA was eluted with 30 µL of Qiagen
buffer EB provided in the kit.

Next, the pHPLT-ASP-C1-2 plasmid DNA was methylated in vitro using a dam
methylase kit from NEB (NEB catalog # M0222S). Briefly, 25 µL of the miniprep DNA (about
1-2 µg) was incubated with 20 µL of the 10x NEB dam methylase buffer, 0.5 µL of
S-adenosylmethionine (80 µM), 4 µL of the dam methylase and 150.5 µL of sterile distilled
water. The reaction was incubated at 37°C for 4 hours, after which the DNA was purified
using a Qiagen PCR purification kit. The methylated DNA was eluted with 40 µL of buffer EB
provided in the kit. To confirm methylation of the DNA, 4 µL of the purified, methylated DNA
was digested with MboI (NEB; this enzyme cuts unmethylated GATC sites) or DpnI (Roche;
this enzyme cuts methylated GATC sites) in a 20 µL reaction using 2 µL of each enzyme.
The reactions were incubated at 37°C for 2 hours and they were analyzed on a 1.2% E-gel
(Invitrogen). A small molecular weight DNA smear/ladder was observed for the DpnI digest,
whereas the MboI digest showed intact DNA, which indicated that the pHPLT-ASP-C1-2
plasmid was successfully methylated.
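
As a quick arithmetic check on the methylation reaction described above, the following Python sketch sums the listed component volumes, derives the final buffer strength, and encodes the digest interpretation (MboI cuts unmethylated GATC sites, DpnI cuts methylated ones). The function names and the wording of the returned labels are illustrative assumptions, not part of the kit protocol.

```python
# Component volumes in microlitres, as listed in the text.
DAM_REACTION_UL = {
    "miniprep DNA": 25.0,
    "10x dam methylase buffer": 20.0,
    "S-adenosylmethionine": 0.5,
    "dam methylase": 4.0,
    "sterile distilled water": 150.5,
}


def total_volume(components):
    """Total reaction volume in microlitres."""
    return sum(components.values())


def final_buffer_strength(components, buffer_key, stock_x=10.0):
    """Final buffer concentration (in 'x') after dilution into the reaction."""
    return stock_x * components[buffer_key] / total_volume(components)


def gatc_methylation_call(cut_by_mboi, cut_by_dpni):
    """Interpret a paired MboI / DpnI digest of the same plasmid prep."""
    if cut_by_dpni and not cut_by_mboi:
        return "methylated"
    if cut_by_mboi and not cut_by_dpni:
        return "unmethylated"
    if cut_by_mboi and cut_by_dpni:
        return "partially methylated or mixed population"
    return "inconclusive (no digestion observed)"


if __name__ == "__main__":
    print(total_volume(DAM_REACTION_UL), "uL total")
    print(final_buffer_strength(DAM_REACTION_UL, "10x dam methylase buffer"), "x buffer")
    # Result reported in the text: DpnI smear, MboI intact.
    print(gatc_methylation_call(cut_by_mboi=False, cut_by_dpni=True))
```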

Library Construction 

The cysteine (cys) and arginine (arg) combinatorial libraries were constructed as
outlined in the Stratagene QCMS kit, with the exception of the primer concentration used in
the reactions. Specifically, 4 µL of the methylated, purified pHPLT-ASP-C1-2 plasmid (about
25 to 50 ng) was mixed with 15 µL of sterile distilled water, 1.5 µL of dNTP, 2.5 µL of 10x
buffer, 1 µL of the enzyme blend and 1.0 µL of arginine or cysteine mutant primer mix (i.e.,
for a total of 100 ng of primers). The primer mix was prepared using 10 µL of each of the nine
arginine primers (100 ng/µL) or each of the six cysteine primers (100 ng/µL). In a previous
round of mutagenesis, adding 50 ng of each primer for both the arg and cys libraries, as
recommended in the Stratagene manual, resulted in less than 50% of the clones containing
mutations. Thus, the protocol was modified in the present round of mutagenesis to
include a total of 100 ng of primers in each reaction. The cycling conditions were 95°C for 1
min, followed by 30 cycles of 95°C for 1 min, 55°C for 1 min, and 65°C for 9 min, in an MJ
Research thermocycler using thin-walled 0.2 mL PCR tubes. The reaction product was
digested with 1 µL of DpnI from the QCMS kit by incubating at 37°C overnight. An additional
0.5 µL of DpnI was added, and the reaction was incubated for 1 hour.
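
The primer-mix and reaction arithmetic in the paragraph above can be sketched as follows. This is only a bookkeeping aid under the stated figures (equal volumes of each 100 ng/µL primer pooled, 1 µL of pool per reaction); the function names are illustrative and the cycling total ignores ramp times.

```python
def primer_mix(n_primers, volume_each_ul=10.0, stock_ng_per_ul=100.0):
    """Concentrations in a pool made from equal volumes of each primer stock."""
    total_volume_ul = n_primers * volume_each_ul
    total_ng = n_primers * volume_each_ul * stock_ng_per_ul
    return {
        "total_ng_per_ul": total_ng / total_volume_ul,
        "each_ng_per_ul": stock_ng_per_ul / n_primers,
    }


# QCMS reaction components in microlitres, as listed in the text.
QCMS_REACTION_UL = {
    "methylated plasmid": 4.0,
    "sterile distilled water": 15.0,
    "dNTP": 1.5,
    "10x buffer": 2.5,
    "enzyme blend": 1.0,
    "primer mix": 1.0,
}


def cycling_minutes(cycles=30, initial=1, denature=1, anneal=1, extend=9):
    """Programmed hold time in minutes, excluding temperature ramps."""
    return initial + cycles * (denature + anneal + extend)


if __name__ == "__main__":
    for name, n in (("arginine", 9), ("cysteine", 6)):
        mix = primer_mix(n)
        print(name, "pool:", mix["total_ng_per_ul"], "ng/uL total,",
              round(mix["each_ng_per_ul"], 1), "ng/uL of each primer")
    print(sum(QCMS_REACTION_UL.values()), "uL reaction;",
          cycling_minutes(), "min of programmed holds")
```

With nine primers each primer contributes roughly 11 ng per reaction, and with six primers roughly 17 ng, for 100 ng in total in both cases.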
To transform the library DNA directly into Bacillus cells without going through E. coli,
the library DNA (single-stranded QCMS product) was amplified using the TempliPhi kit
(Amersham cat. #25-6400), because Bacillus requires double-stranded multimeric DNA for
transformation. For this purpose, 1 µL of the arginine or cysteine QCMS reaction was mixed
with 5 µL of sample buffer from the TempliPhi kit and heated for 3 minutes at 95°C to
denature the DNA. The reaction was placed on ice to cool for 2 minutes and then spun
down briefly. Next, 5 µL of reaction buffer and 0.2 µL of phi29 polymerase from the
TempliPhi kit were added, and the reactions were incubated at 30°C in an MJ Research
PCR machine for 4 hours. The phi29 enzyme was heat inactivated in the reactions by
incubation at 65°C for 10 min in the PCR machine.

For transformation of the libraries into Bacillus, 0.5 µL of the TempliPhi amplification
reaction product was mixed with 100 µL of comK competent cells followed by vigorous
shaking at 37°C for 1 hour. The transformation was serially diluted up to 10 fold, and 50 µL
of each dilution was plated on LA plates containing 10 ppm neomycin and 1.6% skim milk.
Twenty-four clones from each library were picked for sequencing. Briefly, the colonies were
resuspended in 20 µL of sterile distilled water and 2 µL was then used for PCR with
ReadyTaq beads (Amersham) in a total volume of 25 µL. Primers ASPF1 and ASPR4 were
added at a concentration of 0.5 µM. Cycling conditions were 94°C for 4 min once, followed
by 30 cycles of 94°C for 1 min, 55°C for 1 min, and 72°C for 1 min, followed by one round at
72°C for 7 min. A 1.5 kb fragment was obtained in each case and the product was purified
using a Qiagen PCR purification kit. The purified PCR products were sequenced with
ASPF4 and ASPR4 primers.

A total of 48 clones were sequenced (24 from each library). The mutagenesis
worked quite well in that only about 15% of the clones were WT. However, 20% of the clones
had mixed sequences, either because the plate was crowded with colonies or because the
TempliPhi amplification yielded very concentrated DNA for transformation. Also, as indicated
above, about 12% of clones had extra mutations. The remaining clones were all mutant, and
of these about 60-80% were unique mutants. The sequencing results for the arginine and
cysteine libraries are provided below in Tables 16-2 and 16-3.



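Table 16-2. Arginine Library Sequencing and Skim Milk Plate Results

Colony  Halo    R14L   R16Q   R35F   R61S   R79T   R123L  R127Q  R159Q  R179Q
R1      medium  -      X      X      -      -      -      -      X      -
R2      yes     -      -      -      -      -      -      -      X      -
R3      yes     -      X      -      -      -      X      -      -      -
R4      yes     -      X      -      -      -      X      -      -      -
R5      yes     -      X      -      -      -      X      -      -      -
R6      yes     -      X      -      -      -      X      -      -      -
R7      yes     X      -      -      -      -      -      X      X      -
R8      yes     -      X      -      -      -      X      -      -      -
R9      yes     -      -      -      -      -      -      -      -      -
R10     yes     X      -      -      -      -      -      -      -      X
R11     yes     -      -      -      -      -      -      -      -      -
R12     medium  -      X      X      -      -      -      -      X      -
R13     yes     -      -      -      -      X      -      -      -      -
R14     yes     -      -      -      -      -      -      -      -      -
R15     yes     -      -      -      -      -      -      -      -      -
R16     medium  -      -      -      -      -      -      -      -      -
R17     no      -      -      -      X      -      X      X      -      -
R18     medium  -      -      -      -      -      X      X      -      X
R19     medium  -      -      -      -      -      -      -      -      -
R20     yes     X      -      -      -      -      -      X      X      -
R21     medium  -      X      -      -      X      -      X      -      -
R22     small   -      -      -      -      -      -      -      -      -
R23     yes     -      X      -      -      X      -      -      -      -
R24     yes     -      -      -      -      -      -      -      -      -
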
Table 16-3. Cysteine Library Sequencing and Skim Milk Plate Results

Colony  Halo?   C17S   C33S   C95S   C105S  C131S  C158S
C1      no      X      X      -      -      -      -
C2      no      -      -      -      -      -      -
C3      yes     -      -      -      -      -      -
C4      yes     -      -      -      -      -      -
C5      no      X      -      X      -      -      -
C6      small   X      -      -      X      -      -
C7      no      -      -      X      X      X      -
C8      yes     -      -      -      -      -      -
C9      no      -      -      -      -      -      -
C10     no      -      -      -      -      -      -
C11     small   -      -      -      -      -      -
C12     no      -      -      -      -      -      -
C13     no      X      -      X      -      -      -
C14     no      X      X      X      -      -      X
C15     no      -      -      -      -      -      -
C16     no      -      -      -      -      -      X
C17     no      -      -      -      -      -      X
C18     no      X      -      X      X      -      X
C19     yes     -      -      -      -      -      -
C20     no      -      -      -      -      -      -
C21     no      -      -      -      -      -      -
C22     no      -      -      -      X      -      -
C23     no      X      -      X      -      -      -
C24     yes     -      -      -      -      -      -

Of the mutants identified in sequencing, the following mutants from the arginine
library (See, Table 16-4) were found to be of interest. See the Examples below for
additional data regarding the properties of these mutants.



Table 16-4. Arginine Mutants of Interest

MUTANT   SEQUENCE
R1       R16Q R35F R159Q
R2       R159Q
R3       R16Q R123L
R7       R14L R127Q R159Q
R10B     R14L R179Q
R18      R123L R127Q R179Q
R21      R16Q R79T R127Q
R23      R16Q R79T
R10      R14L R79T
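
The mutant designations in Table 16-4 concatenate individual substitutions in the form wild-type residue, position, new residue. A minimal Python sketch for splitting such a designation into its substitutions is given below; the function name is illustrative, and the pattern assumes single-letter amino acid codes, with or without spaces between substitutions.

```python
import re

# One substitution: wild-type residue, 1-based position, replacement residue.
SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_mutant(label):
    """Split a designation such as 'R16QR35FR159Q' into (wt, position, new) tuples."""
    return [(wt, int(pos), new) for wt, pos, new in SUBSTITUTION.findall(label)]


if __name__ == "__main__":
    for label in ("R16QR35FR159Q", "R14L R127Q R159Q", "R159Q"):
        subs = parse_mutant(label)
        print(label, "->", subs, "(%d arginine(s) replaced)" % len(subs))
```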



Importantly, the activity results indicated that mutations in the cysteine residues 
produced ASP proteases with very low or no activity, suggesting that the disulfide bridges 
play an important role in the stability of the molecule. However, it is not intended that the 
present invention be limited to any particular mechanism(s). 

EXAMPLE 17 

Expression of Homologous O. turbata Protease in S. lividans 
In this Example, expression in S. lividans of a protease produced by O. turbata that is
homologous to the protease 69B4 is described. Thus, this Example describes plasmids
comprising polynucleotides encoding a polypeptide having proteolytic activity, and the use of
such vectors to transform a Streptomyces lividans host cell. The transformation methods used
herein are known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245, herein
incorporated by reference).

The vector (i.e., plasmid) used in these experiments comprised a polynucleotide
encoding a protease of the present invention obtained from Oerskovia turbata DSM 20577.
This plasmid was used to transform Streptomyces lividans. The final plasmid vector is
referred to herein as "pSEA4CT-O.turbata."

As with previous vectors, the construction of pSEA4CT-O.turbata made use of the
pSEGCT plasmid vector (See, above).

An Aspergillus niger ("A4") regulatory sequence operably linked to the structural
gene encoding the Oerskovia turbata protease (Otp) was used to drive the expression of the
protease. A fusion between the A4-regulatory sequence and the Oerskovia turbata
signal-sequence, N-terminal prosequence and mature protease sequence (i.e., without the
C-terminal prosequence) was constructed by fusion-PCR techniques known in the art, as an
XbaI-BamHI fragment. The polynucleotide primers for the cloning of Oerskovia turbata
protease (Otp) in pSEA4CT were based on SEQ ID NO:67. The primer sequences used
were:

A4-turb Fw
5'-CAGAGACAGACCCCCGGAGGTAACCATGGCACGATCATTCTGGAGGACGC-3' (SEQ ID NO:613)

A4-turb RV
5'-GCGTCCTCCAGAATGATCGTGCCATGGTTACCTCCGGGGGTCTGTCTCTG-3' (SEQ ID NO:614)

A4-turb Bam Rv
5'-ATCCGCTCGCGGATCCCCATTGTCAGCTCGGGCCCCCACCGTCAGAGGTCACGAG-3' (SEQ ID NO:615)

A4-Xba1-FW
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCAT-3' (SEQ ID NO:616)

The fragment was ligated into plasmid pSEA4CT digested with XbaI and BamHI,
resulting in plasmid pSEA4CT-O.turbata.
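
In fusion PCR, the two internal primers that span the junction are typically exact reverse complements of one another, so that the two first-round products overlap and can prime each other. The short Python sketch below checks this property for the A4-turb Fw and A4-turb RV primers listed above (SEQ ID NO:613 and 614, with the OCR spacing removed). The helper name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Junction primers as listed in the text (5' to 3').
A4_TURB_FW = "CAGAGACAGACCCCCGGAGGTAACCATGGCACGATCATTCTGGAGGACGC"
A4_TURB_RV = "GCGTCCTCCAGAATGATCGTGCCATGGTTACCTCCGGGGGTCTGTCTCTG"

if __name__ == "__main__":
    overlap_ok = reverse_complement(A4_TURB_FW) == A4_TURB_RV
    print("junction primers are reverse complements:", overlap_ok)
    print("overlap length:", len(A4_TURB_FW), "nt")
```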

The host Streptomyces lividans TK23 was transformed with plasmid vector
pSEA4CT-O.turbata using the protoplast method described in the previous Example (i.e.,
using the method of Hopwood et al., supra).

The transformed culture was expanded to provide two fermentation cultures in TS*
medium. The composition of TS* medium was (g/L) tryptone (Difco) 16, soytone (Difco) 4,
casein hydrolysate (Merck) 20, K2HPO4 10, glucose 15, Basildon antifoam 0.6, pH 7.0. At
various time points, samples of the fermentation broths were removed for analysis. For the
purposes of this experiment, a skim milk procedure was used to confirm successful cloning.
30 µL of the shake flask supernatant was pipetted into holes punched out of skim milk agar
plates and incubated at 37°C.

The incubated plates were visually reviewed after overnight incubation for the 
presence of clearing zones (halos) indicating the expression of proteolytic enzyme. For 
purposes of this experiment, the samples were also assayed for protease activity and for 
molecular weight (SDS-PAGE). At the end of the fermentation, full length protease was 
observed by SDS-PAGE. 

A sample of the fermentation broth was assayed as follows: 10 µL of the diluted
supernatant was collected and analyzed using the Dimethylcasein Hydrolysis Assay
described in Example 1. The assay results of the fermentation broth of 2 clones clearly
show that the polynucleotide from Oerskovia turbata encoding a polypeptide having
proteolytic activity was expressed in Streptomyces lividans.



EXAMPLE 18 

Expression of Homologous Cellulomonas and Cellulosimicrobium 

Proteases in S. lividans 

In this Example, expression in S. lividans of proteases produced by Cellulomonas cellasea
DSM 20118 and Cellulosimicrobium cellulans DSM 204244 that are homologous to the protease
69B4 is described. Thus, this Example describes plasmids comprising
polynucleotides encoding a polypeptide having proteolytic activity, and the use of such vectors
to transform a Streptomyces lividans host cell. The transformation methods used herein are
known in the art (See e.g., U.S. Pat. No. 6,287,839; and WO 02/50245, herein incorporated
by reference).

The final plasmid vectors are referred to as pSEA4CT-C.cellasea and pSEA4CT- 
Cm.cellulans. The construction of pSEA4CT-C.cellasea and pSEA4CT-Cm.cellulans made 
use of the pSEGCT plasmid vector described above. 

An Aspergillus niger ("A4") regulatory sequence operably linked to the structural
gene encoding the Cellulomonas cellasea mature protease (Ccp) or alternatively, the
structural gene encoding the Cellulosimicrobium cellulans mature protease (Cmcp) was
used to drive the expression of the protease. A fusion between the A4-regulatory sequence
and the 69B4 protease signal-sequence, N-terminal prosequence of the 69B4 protease
gene and mature sequence of the native protease gene obtained from genomic DNA of a
strain of Micrococcineae (herein, Cellulomonas cellasea or Cellulosimicrobium cellulans)
was constructed by fusion-PCR techniques, as an XbaI-BamHI fragment. The polynucleotide
primers for the cloning of Cellulomonas cellasea protease (Ccp) in pSEA4CT were based on
SEQ ID NO:63, and are as follows:

Asp-npro fw-cell
5'-AGACCGACGAGACCCCGCGGACCATGGTCGACGTCATCGGCGGCAACGCGTACTAC-3' (SEQ ID NO:617)

Cell-BH1-rv
5'-TCAGCCGATCCGCTCGCGGATCCCCATTGTCAGCCCAGGACGAGACGCAGACCGTA-3' (SEQ ID NO:618)

Asp-npro rv-cell
5'-GTAGTACGCGTTGCCGCCGATGACGTCGACCATGGTCCGCGGGGTCTCGTCGGTCT-3' (SEQ ID NO:619)

Xba-1 fw A4
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCATGTTCGA-3' (SEQ ID NO:620)

The polynucleotide primers for the cloning of Cellulosimicrobium cellulans protease
(Cmcp) in pSEA4CT were based on SEQ ID NO:71, and are as follows:

ASP-npro fw cellu
5'-ACCGAGGAGACCCCGCGGACCATGCACGGCGACGTGCGCGGCGGCGACCGCTA-3' (SEQ ID NO:621)

ASP-npro rv cellu
5'-TAGCGGTCGCCGCCGCGCACGTCGCCGTGCATGGTCCGCGGGGTCTCGTCGGT-3' (SEQ ID NO:622)

Cellu-BH1-rv
5'-TCAGCCGATCCGCTCGCGGATCCCCATTGTCAGCGAGCCCGACGAGCGCGCTGCCCGAC-3' (SEQ ID NO:623)

Xba-1 fw A4
5'-GCAGCCTGAACTAGTTGCGATCCTCTAGAGATCGAACTTCATGTTCGA-3' (SEQ ID NO:620)

The host Streptomyces lividans TK23 was transformed with plasmid vector
pSEA4CT using the protoplast method described above (i.e., Hopwood et al., supra). The
transformed culture was expanded to provide two fermentation cultures in TS* medium. The
composition of TS* medium was (g/L) tryptone (Difco) 16, soytone (Difco) 4, casein
hydrolysate (Merck) 20, K2HPO4 10, glucose 15, Basildon antifoam 0.6, pH 7.0. At various
time points, samples of the fermentation broths were removed for analysis. For the
purposes of this experiment, a skim milk procedure was used to confirm successful cloning.
30 µL of the shake flask supernatant was pipetted into holes punched out of skim milk agar
plates and incubated at 37°C.

The incubated plates were visually reviewed after overnight incubation for the 
presence of clearing zones (halos) indicating the expression of proteolytic enzyme. For 
purposes of this experiment, the samples were also assayed for protease activity and for 
molecular weight (SDS-PAGE). At the end of the fermentation full length protease was 
observed by SDS-PAGE. 

A sample of the fermentation broth was assayed as follows: 10 µL of the diluted
supernatant was taken and added to 190 µL AAPF substrate solution (conc. 1 mg/ml, in 0.1
M Tris/0.005% Tween 80, pH 8.6). The rate of increase in absorbance at 410 nm due to
release of p-nitroaniline was monitored (25°C).
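
The assay readout above is a rate, i.e., the slope of absorbance at 410 nm against time. The following Python sketch shows one way such a slope could be estimated by ordinary least squares, together with the dilution implied by adding 10 parts supernatant to 190 parts substrate solution. The function names are illustrative, and the readings in the usage example are invented placeholder values, not data from these experiments.

```python
def rate_of_change(times_s, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per second)."""
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_s, absorbances))
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


def dilution_factor(sample_parts=10.0, substrate_parts=190.0):
    """Fold dilution of the sample in the final assay volume."""
    return (sample_parts + substrate_parts) / sample_parts


if __name__ == "__main__":
    # Hypothetical A410 readings taken every 10 seconds.
    times = [0, 10, 20, 30]
    a410 = [0.100, 0.150, 0.200, 0.250]
    print("slope:", rate_of_change(times, a410), "A410 per second")
    print("in-assay dilution:", dilution_factor(), "fold")
```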

As in previous Examples, the results obtained clearly indicated that the
polynucleotides from Cellulomonas cellasea and from Cellulosimicrobium cellulans, both
encoding polypeptides having proteolytic activity, were expressed in Streptomyces lividans.

EXAMPLE 19 

Determination of the Crystal Structure of ASP Protease 

In this Example, methods used to determine the crystal structure of ASP protease
are described. Indeed, high quality single crystals were obtained from purified ASP
protease. The crystallization conditions were as follows: 25% PEG 8000, 0.2 M ammonium
sulphate, and 15% glycerol. These crystallization conditions are cryo-protective, so transfer
to a cryoprotectant was not required. The crystals were frozen in liquid nitrogen, and kept
frozen during data collection using an Xstream (Molecular Structure). Data were collected
with an R-axis IV (Molecular Structure), equipped with focusing mirrors. X-ray reflection data
were obtained to 1.9 Å resolution. The space group was P2₁2₁2₁, with cell dimensions
a=35.65 Å, b=51.82 Å and c=76.86 Å. There was one molecule per asymmetric unit.
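
For an orthorhombic cell such as this one, the unit cell volume is simply the product of the three edge lengths, and the Matthews coefficient (cell volume per dalton of protein) is a common sanity check on the number of molecules in the cell. The Python sketch below works through that arithmetic. The function names are illustrative; the four asymmetric units per cell follow from the P2₁2₁2₁ space group, while the molecular weight used in the usage example is an assumed round figure and is not stated in this Example.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in cubic angstroms for an orthorhombic cell."""
    return a * b * c


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight_da):
    """Crystal volume per unit of protein mass (cubic angstroms per dalton)."""
    return cell_volume / (molecules_per_cell * molecular_weight_da)


if __name__ == "__main__":
    volume = orthorhombic_cell_volume(35.65, 51.82, 76.86)
    print("cell volume: %.0f cubic angstroms" % volume)

    # P2(1)2(1)2(1) has four asymmetric units per cell; with one molecule
    # per asymmetric unit that gives four molecules per cell.
    assumed_mw_da = 19000.0  # ASSUMPTION: placeholder molecular weight
    vm = matthews_coefficient(volume, 4, assumed_mw_da)
    print("Matthews coefficient (assumed MW): %.2f" % vm)
```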

The crystal structure was solved using the molecular replacement method. The
program used was X-MR (Accelrys Inc.). The starting model for the molecular replacement
calculations was Streptogrisin. It is clear from the electron density map obtained from X-MR
that the molecular replacement solution is correct. Thus, 98% of the model was built
correctly, with some minor errors that were fixed manually. The R-factor for data to 1.9 Å
was 0.23.

The structure was found to largely consist of β-sheets, with 2 very short α-helices,
and a longer helix toward the C-terminal end. There are two sets of β-sheets, with a
considerable interface between them. The active-site is found in a cleft formed at this
interface. The catalytic triad is formed by His 32, Asp 56, and Ser 137. Table 19-1 provides
the atomic coordinates identified for ASP.
Table 19-1. Atomic Coordinates for ASP
20.031 24.816 1.00 14.04 C

19.660 24.140 1.00 11.09 O 

20.033 22.945 1.00 10.83 N 

19.750 24.365 1.00 13.46 C 

20.681 25.965 1.00 11.82 N 

20.995 26.467 1.00 9.81 C 

19.785 26.644 1.00 11.77 C 

18.737 27.114 1.00 9.20 O 

19.927 26.255 1.00 10.03 N 
18.836 26.397 1.00 8.54 C 
17.979 25.156 1.00 9.57 C 
17.200 25.042 1.00 7.69 O 
18.119 24.224 1.00 9.01 N 
17.359 22.985 1.00 10.51 C 
17.293 22.349 1.00 14.65 C 
16.250 22.981 1.00 10.35 C 
16.144 22.644 1.00 13.61 O 
15.470 23.896 1.00 6.66 N 
17.952 21.976 1.00 12.30 C 
19.146 21.991 1.00 15.93 O 
17.107 21.079 1.00 11.08 N 
17.564 20.063 1.00 14.32 C 
16.392 19.541 1.00 14.61 C 
18.202 18.914 1.00 11.23 C 
17.987 18.747 1.00 12.54 O 
19.021 18.145 1.00 9.75 N 
19.691 16.988 1.00 12.42 C 

21.059 17.334 1.00 12.79 C 
22.150 17.790 1.00 14.12 C 
22.333 19.139 1.00 10.17 C 
23.366 19.572 1.00 12.49 C 
23.029 16.877 1.00 9.02 C 
24.066 17.294 1.00 10.92 C 
24.230 18.644 1.00 13.93 C 
25.261 19.070 1.00 12.50 O 
19.792 16.101 1.00 12.21 C 
19.589 16.583 1.00 11.38 O 
20.055 14.816 1.00 11.44 N 
20.144 13.946 1.00 13.35 C 
18.998 12.916 1.00 14.07 C 
19.098 12.086 1.00 13.63 O 
17.648 13.620 1.00 12.60 C 
21.452 13.194 1.00 14.66 C 
22.161 12.907 1.00 12.64 O 
21.749 12.881 1.00 14.05 N 
22.954 12.157 1.00 18.00 C 
23.931 13.068 1.00 17.58 C 
25.230 12.323 1.00 20.00 C 
24.212 14.327 1.00 21.47 C 
25.031 15.377 1.00 23.61 C 
22.485 11.030 1.00 16.40 C 
22.014 11.278 1.00 17.72 O 
22.603 9.794 1.00 18.83 N 
22.155 8.673 1.00 17.69 C 
20.659 8.791 1.00 18.86 C 
20.141 8.376 1.00 19.71 O 
19.964 9.380 1.00 17.62 N 
18.525 9.529 1.00 16.37 C 

18.060 10.703 1.00 17.10 C 
16.861 10.946 1.00 15.94 O 
19.002 11.438 1.00 17.27 N 
18.667 12.585 1.00 15.15 C 
19.558 12.586 1.00 19.68 C 
18.842 13.908 1.00 16.27 C 
19.882 14.159 1.00 12.16 O 
19.273 13.699 1.00 25.94 C 
18.069 13.393 1.00 31.69 C 

17.928 14.360 1.00 40.26 N 
17.572 15.630 1.00 42.65 C 
17.483 16.435 1.00 45.09 N 
17.302 16.091 1.00 41.89 N 
17.821 14.756 1.00 14.36 N 
17.892 16.047 1.00 18.05 C 
16.488 16.524 1.00 19.52 C 
18.533 17.011 1.00 18.51 C 
17.870 17.882 1.00 16.89 O 
16.542 17.742 1.00 24.25 O 
19.829 16.842 1.00 15.76 N
35.802 20.801 1.00 15.23 N
30.964 15.979 1.00 12.67 N

31.083 15.716 1.00 11.18 C 

32.314 14.844 1.00 12.63 C 

32.398 14.379 1.00 17.12 C 

33.527 13.376 1.00 20.85 C 

33.614 12.971 1.00 24.18 N 

32.744 12.171 1.00 24.05 C 

32.904 11.884 1.00 25.34 N 

31.708 11.670 1.00 25.91 N 

29.831 15.011 1.00 12.67 C 

29.316 14.096 1.00 11.46 O 

29.333 15.461 1.00 13.58 N 

28.147 14.865 1.00 13.24 C 

26.995 15.884 1.00 11.66 C 

27.450 17.007 1.00 13.55 O 

26.485 16.349 1.00 13.26 C 

28.558 14.335 1.00 13.42 C 

29.568 14.770 1.00 16.80 O 

27.778 13.406 1.00 16.51 N 

28.108 12.834 1.00 15.85 C 

27.033 12.894 1.00 16.64 C 

26.432 13.938 1.00 12.21 O 

26.788 11.753 1.00 15.51 N 

25.810 11.663 1.00 15.84 C

24.378 11.977 1.00 15.00 C 

23.977 11.742 1.00 15.60 O 

25.866 10.279 1.00 16.27 C 

23.614 12.510 1.00 17.17 N 

22.217 12.828 1.00 19.41 C 

21.946 13.953 1.00 19.21 C 

20.790 14.234 1.00 22.10 O 

23.001 14.603 1.00 15.20 N

22.844 15.697 1.00 15.99 C 

23.746 15.501 1.00 15.02 C 

23.195 17.016 1.00 18.46 C 

24.349 17.257 1.00 16.96 O 

23.602 16.688 1.00 13.36 C 

23.375 14.195 1.00 11.46 C 

22.193 17.866 1.00 15.34 N 

22.407 19.158 1.00 16.12 C 

21.177 19.539 1.00 21.01 C 

22.704 20.228 1.00 17.24 C 

21.862 20.554 1.00 17.97 O 

20.748 18.431 1.00 29.21 C 

21.527 17.976 1.00 33.32 O 

19.505 17.982 1.00 33.03 N 

23.915 20.767 1.00 13.94 N 

24.378 21.807 1.00 14.43 C 

25.896 21.707 1.00 13.70 C 

23.985 23.178 1.00 15.01 C 

24.568 23.664 1.00 16.08 O 

26.395 20.358 1.00 8.95 C 

27.910 20.284 1.00 8.47 C 

25.931 20.179 1.00 12.27 C 

23.005 23.805 1.00 12.99 N 

22.529 25.119 1.00 12.18 C 

20.997 25.134 1.00 12.27 C 

20.310 24.029 1.00 16.54 C 

18.802 24.113 1.00 17.85 C 

20.679 24.170 1.00 19.65 C 

23.050 26.307 1.00 14.39 C 

23.239 26.228 1.00 14.53 O 

23.271 27.416 1.00 12.89 N 

23.761 28.635 1.00 14.83 C 

24.547 29.457 1.00 18.71 C 

22.519 29.391 1.00 12.67 C 

22.293 30.523 1.00 11.15 O 

21.711 28.742 1.00 13.59 N 

20.483 29.334 1.00 14.04 C 

19.282 28.809 1.00 14.08 C 

19.283 29.157 1.00 17.65 C 
18.099 28.560 1.00 19.50 C 
17.592 29.143 1.00 24.87 O 
17.658 27.386 1.00 17.48 N 
20.255 29.011 1.00 16.23 C 
20.786 28.035 1.00 15.48 O 
19.451 29.840 1.00 13.56 N 
19.133 29.648 1.00 12.57 C
19.754 30.748 1.00 10.81 C

19.162 30.677 1.00 12.46 C 

21.271 30.563 1.00 10.56 C 

17.610 29.695 1.00 10.65 C 

16.968 30.565 1.00 11.44 O 

17.036 28.746 1.00 11.79 N 

15.586 28.673 1.00 10.87 C 

15.229 27.369 1.00 12.22 C 

13.743 27.065 1.00 12.04 C 

12.916 27.959 1.00 11.92 O 

13.397 25.798 1.00 10.15 N 

15.098 29.863 1.00 11.93 C 

15.597 30.092 1.00 11.67 O 

14.137 30.631 1.00 12.17 N 

13.640 31.766 1.00 9.29 C 

13.275 32.958 1.00 13.70 C 

12.126 32.662 1.00 16.27 C 

11.274 31.815 1.00 14.62 O 

12.088 33.384 1.00 18.77 N 

12.435 31.359 1.00 11.15 C 

11.823 32.189 1.00 10.62 O 

12.115 30.069 1.00 13.30 N 

10.998 29.495 1.00 13.21 C 

11.363 29.386 1.00 10.04 C 

12.223 28.170 1.00 11.82 C 

11.652 26.907 1.00 10.82 C 

12.435 25.775 1.00 12.83 C 

13.608 28.271 1.00 10.15 C 

14.402 27.142 1.00 10.33 C 

13.805 25.898 1.00 9.45 C 

14.579 24.770 1.00 10.77 O 

9.652 30.188 1.00 16.68 C 

8.835 30.228 1.00 18.39 O 

9.433 30.723 1.00 18.33 N 

8.193 31.411 1.00 20.49 C 

8.390 32.926 1.00 21.53 C 

8.424 33.457 1.00 25.72 O 

7.775 30.919 1.00 21.06 C 

6.868 31.473 1.00 20.62 O 

8.452 29.870 1.00 17.80 N 

8.156 29.295 1.00 18.95 C 

9.128 29.752 1.00 17.33 C 

8.942 29.470 1.00 16.27 O 

10.173 30.462 1.00 15.79 N 

11.142 30.939 1.00 16.07 C 

12.585 30.734 1.00 17.80 C 

12.876 30.041 1.00 15.22 O 

13.492 31.342 1.00 17.07 N 

14.914 31.226 1.00 19.85 C 

15.528 30.090 1.00 23.00 C 

15.176 28.701 1.00 29.54 C 

14.404 27.910 1.00 35.50 C 

13.941 26.626 1.00 39.02 N 

13.094 26.493 1.00 41.51 C 

12.611 27.566 1.00 38.71 N 

12.716 25.285 1.00 43.02 N 

15.695 32.510 1.00 18.62 C

15.313 33.353 1.00 16.21 O 

16.793 32.646 1.00 15.48 N 

17.660 33.810 1.00 14.56 C 

18.040 34.349 1.00 14.30 C 

18.967 35.542 1.00 16.79 C 

16.787 34.729 1.00 18.13 C 

18.930 33.375 1.00 15.88 C 

19.588 32.421 1.00 14.01 O 

19.270 34.057 1.00 15.03 N 

20.472 33.698 1.00 18.02 C 

20.512 34.403 1.00 21.88 C 

19.659 33.760 1.00 29.23 C 

20.030 32.311 1.00 29.08 C 

21.194 31.984 1.00 31.12 O 

19.034 31.434 1.00 32.61 N 

21.722 34.067 1.00 17.67 C 

21.786 35.128 1.00 18.79 O 

22.709 33.179 1.00 15.10 N 

23.960 33.411 1.00 17.88 C 

24.494 32.107 1.00 16.36 C 

25.792 32.374 1.00 19.17 C
23.454 31.534 1.00 15.85 C

24.952 33.930 1.00 18.78 C 

25.494 33.163 1.00 19.15 O 

25.175 35.240 1.00 20.30 N 

26.091 35.879 1.00 20.84 C 

25.725 37.348 1.00 20.26 C 

27.568 35.751 1.00 20.34 C 

28.405 35.594 1.00 21.44 O

27.886 35.826 1.00 18.33 N 

29.267 35.721 1.00 15.96 C 
29.381 35.558 1.00 18.35 C 
28.433 35.117 1.00 16.24 O
30.534 35.924 1.00 16.53 N 
30.767 35.798 1.00 14.08 C 
31.528 34.498 1.00 14.33 C 
31.510 36.988 1.00 14.07 C 
32.413 36.818 1.00 15.60 O 
32.770 34.323 1.00 18.31 C 
34.035 34.404 1.00 21.61 N 
32.936 34.064 1.00 19.95 C 
34.289 33.994 1.00 18.84 N 
34.929 34.202 1.00 22.08 C 
31.124 38.193 1.00 14.33 N 
31.758 39.405 1.00 13.94 C 
31.449 40.612 1.00 15.26 C 
31.243 39.690 1.00 14.65 C 
30.042 39.855 1.00 11.10 O 
31.904 40.347 1.00 16.89 O 
32.147 41.854 1.00 16.68 C 
32.157 39.756 1.00 15.86 N 
31.801 40.016 1.00 17.16 C 
31.152 41.375 1.00 19.39 C 
31.608 42.395 1.00 18.84 O 
33.034 39.877 1.00 17.44 C 
30.088 41.373 1.00 16.82 N 
29.352 42.584 1.00 14.95 C 
29.832 43.119 1.00 15.66 C 
30.204 42.355 1.00 15.62 O 
27.861 42.291 1.00 10.05 C 
29.822 44.447 1.00 15.05 N 

30.259 45.100 1.00 16.15 C 
30.498 46.535 1.00 16.59 C 

29.260 45.022 1.00 17.33 C 
28.076 44.741 1.00 14.79 O 
29.522 45.448 1.00 17.98 C 
29.425 46.728 1.00 15.94 C 
29.751 45.257 1.00 18.24 N 
28.894 45.221 1.00 17.32 C 
29.658 45.672 1.00 16.39 C 
28.678 45.932 1.00 19.70 C 
30.665 44.609 1.00 18.18 C 
27.770 46.211 1.00 17.15 C 
28.005 47.254 1.00 17.16 O 
26.556 45.878 1.00 13.56 N 

25.420 46.755 1.00 13.61 C 
24.583 46.314 1.00 14.54 C 
23.422 46.695 1.00 13.48 O 
25.175 45.497 1.00 12.12 N 
24.486 45.014 1.00 13.41 C 
25.457 44.239 1.00 10.87 C 
26.463 45.090 1.00 12.36 O 
23.284 44.134 1.00 13.34 C 

23.268 43.383 1.00 9.90 O 
22.274 44.252 1.00 11.16 N 
21.057 43.475 1.00 14.34 C 
19.925 44.136 1.00 14.73 C
21.389 42.119 1.00 14.46 C 
21.920 42.047 1.00 13.83 O 
21.092 41.048 1.00 14.27 N 
21.370 39.707 1.00 9.84 C
22.629 39.113 1.00 11.32 C 
23.859 39.904 1.00 9.34 C 
22.467 39.126 1.00 10.97 C 
20.209 38.763 1.00 9.69 C

19.421 38.976 1.00 10.59 O
20.094 37.727 1.00 10.10 N 
19.027 36.752 1.00 9.94 C 
17.983 36.845 1.00 11.63 C 








ATOM 


655 


SG 


PVQ 
CIO 


A 


QC 


i 

X-J 


.412 




ATOM 


656 




pVO 


A 


QC 


1 £ 


. 566 




ATOM 


657 


o 


PVO 
Lib 


A 


QC 


1 c. 


. 808 




ATOM 


658 


N 


Ann 
ARo 


A 


JO 


1 7 


. 424 


5 


ATOM 


659 


CA 


7\pn 
A1\V9 


7V 

A 


JO 


1 7 
X ' 


. 570 




ATOM 


660 


pr 


Ai\v> 


2V 
A 


Q 


1 Q 
17 


. 050 




ATOM 


661 


CG 


A"D/"» 


7V 

A 


DO 


1 Q 

x y 


. 326 




ATOM 


662 


An 

\*>XJ 


Al\V» 


7V 

A 


jO 


on 


. 808 




ATOM 


663 


NE 


Al\l3 


7V 
A 


3D 


71 
ZX 


. 355 


in 

1 U 


ATOM 


664 


CZ 


A-TVV3 


A 


jO 


Z 1/ 


. 957 




ATOM 


665 


TATtM 
INflX 


aTSO 

Artvj 


A 


y o 


1 Q 

xy 


. 995 






O DO 


TVTtIO 




a 
A 


y o 


ZX 


coq 

. DZ j 




ATOM 
A X \JCl 


fifi7 
OD / 


/"« 


atsr 1 
AKLa 


a 

A 


y o 


1 7 


n&ft 

. UO 0 




at 1 DM 
A x 


ODO 




aop 

AKu 


a 

A 


y o 


1 7 
X / 


017 


1 ^ 

1 0 


ax \Jrx 




TvT 
IN 


CPT5 


a 

A 


Q7 

y / 


xo 


. 442 




ATOM 


670 


LA 


CT?T> 
OlVK. 


a 

A 


Q7 

y / 


1 

13 


. 925 






O / X 


Ld 


SER 


a 

A 


Q7 

y / 


1 4 
Xf» 


. 406 




ATOM 


672 


rv/2 


QPT3 


a 

A 


Q7 

y / 


1 7 

X J 


. 893 




aTOM 


O /J 




oiiK 


a 

A 


Q7 

y / 


1 ft 
xo 


. 607 


on 
ZU 


Al 


O /<4 


u 


OPT? 


a 

A 


Q7 

y / 


1 £ 


. O Oft 




a TOM 


O / 3 


IN 


pT V" 


a 

A 




1 7 


, aIj 




aTPM 


o / o 


r«a 


VjXjX 


a 

A 


y O 


1 7 

X. / 


Q "3 Q 






£77 
D / / 




OT.V 


a 


y o 


1 7 
X / 


. 853 




Al vJItt 


D / O 


vJ 


pT,V 


a 
A 


Qfl 

JO 


1 7 
X / 


ton 
. OO j 


oc 

£3 


a mriM 
AlurJ 


£7Q 
O / y 


IN 




a 

A 


QQ 

y y 


1 7 
X / 


QQ-3 

. yyo 




a tom 

Alvll 


o ou 


oa 

V— A 


OTJD 


a 

A 


QQ 

y y 


1 7 


. 884 




AlAjlTi 


P1 
D OX 






a 

A 


QQ 

y y 


1 7 
X / 


. oz 0 




Al \JPi 


D OZ 


VJV3 


O-tLK 


a 

A 


QQ 

y y 


XO 


one; 




7\ mnM 

Aiun 


DOJ 


r* 
i~ 


o£ii\. 


a 

A 


QQ 

yy 


1 Q 


n77 

. u / 0 


OVJ 


AlUm 


£ GA 
D 041 




OT7T5 


a 


Q Q 

y y 


X.O i 


Q70 

. y i z 




A 1 \JPl 


D OD 


TVl 


THR 


a 

A 


1 nn 


on 

ZU i 


1 

. X 73 




A 1 vJiXl 


ODD 


L-A 


THR 


a 
A 


1UU 


01 

zx . 


. 3 03 




AIUW 


COT 

o o / 




THR 


a 
A 


1 nn 
luu 


. oo 
zz . 


. o^3 




ATOM 


cop 


Apl 


THR 


A 


i nn 

1UU 


00 

zz , 


, 3 \>*i 


oo 


ATOM 


boy 




THR 


A 


1 nn 
1UU 


Z J . 


. oou 




Al 


oyu 


/-i 

K. 


THR 


a 
A 


i nn 


01 

ZX . 


• 3fi / 




AiUW 


cqi 

oyi 


C\ 
U 




a 
A 


1 nn 


01 

ZX . 


.OOO 




Aiun 


con 


N 


THR 


A 


i ni 

1U1 


Ol 
ZX i 


• J17 




AivJJXI 




CA 


THR 


A 


i ni 


01 

ZX . 


A ft 
. O O 


>m 
<*u 


aTfYiwr 
AlUn 


oy^ 




THR 


a 
A 


i ni 


oo 

zz , 


ACQ 




Aiuja 






THR 


a 
A 


i m 

XUl 


00 
zz . 


mi 

. UJJL 




AlUl'l 


ojd 


Pf20 


THR 


a 

A 


i ni 

1U1 


23 


847 




A 1 Vjl*l 


OS / 


/-I 
v_ 


l rLrv 


a 

A 


i ni 

J. ux 


on 

Z \J i 


1 Rl 
• X33 




a "pom 


698 


Q 






1 m 

X ux 


20 


. 078 


Ati 
HD 


a TOM 


O j 


"NT 
IN 


GXjY* 




102 


19! 


!ll9 




ATOM 


700 


x-A 


GLY 




102 


17. 


.829 




ATOM 


701 




GLiY 




102 


17. 


.578 




A 1 VJ1*1 


7 no 
/ uz 




GLY 


a^ 


102 


17. 


.846 




ATOM 
Al wri 


/ U3 


in 






103 


17. 


.067 


50 


ATOM    704  CA  TRP A 103      16.716
ATOM    705  CB  TRP A 103      15.370
ATOM    706  CG  TRP A 103      14.837
ATOM    707  CD2 TRP A 103      13.964
ATOM    708  CE2 TRP A 103      13.680
ATOM    709  CE3 TRP A 103      13.387
ATOM    710  CD1 TRP A 103      15.050
ATOM    711  NE1 TRP A 103      14.357
ATOM    712  CZ2 TRP A 103      12.852
ATOM    713  CZ3 TRP A 103      12.561
ATOM    714  CH2 TRP A 103      12.303
ATOM    715  C   TRP A 103      17.790
ATOM    716  O   TRP A 103      18.082
ATOM    717  N   HIS A 104      18.386
ATOM    718  CA  HIS A 104      19.434
ATOM    719  CB  HIS A 104      20.806

ATOM    720  CG  HIS A 104      21.106
ATOM    721  CD2 HIS A 104      20.822
ATOM    722  ND1 HIS A 104      21.684
ATOM    723  CE1 HIS A 104      21.740
ATOM    724  NE2 HIS A 104      21.222
ATOM    725  C   HIS A 104      19.283
ATOM    726  O   HIS A 104      18.959
ATOM    727  N   CYS A 105      19.545
ATOM    728  CA  CYS A 105      19.408
ATOM    729  CB  CYS A 105      18.278
ATOM    730  SG  CYS A 105      16.817
ATOM    731  C   CYS A 105      20.657
ATOM    732  O   CYS A 105      21.720






17.059 38.410 1.00 13.27 S 

19.624 35.359 1.00 10.91 C 

20.551 35.061 1.00 11.33 O 

19.070 34.515 1.00 9.30 N 

19.496 33.135 1.00 7.08 C 

19.767 32.827 1.00 9.79 C 
20.069 31.353 1.00 10.80 C 
19.966 31.011 1.00 10.58 C 
18.643 31.312 1.00 11.86 N 
17.506 30.747 1.00 10.78 C 
17.500 29.831 1.00 10.18 N 
16.365 31.103 1.00 12.39 N 
18.397 32.211 1.00 9.14 C 
17.214 32.499 1.00 10.81 O 
18.792 31.104 1.00 8.35 N 
17.841 30.134 1.00 8.67 C 
17.976 29.984 1.00 10.18 C 
16.991 29.094 1.00 10.36 O 
18.169 28.810 1.00 9.06 C 
19.313 28.353 1.00 10.38 O 
17.168 28.209 1.00 9.45 N 
17.365 26.947 1.00 8.70 C 
16.131 26.070 1.00 11.12 C 
15.021 26.569 1.00 9.32 O 
16.320 24.762 1.00 13.12 N 
15.222 23.805 1.00 13.49 C 
15.784 22.414 1.00 16.87 C 
16.381 21.906 1.00 16.57 O 
14.272 23.709 1.00 13.85 C 
13.230 23.060 1.00 10.18 O
14.617 24.331 1.00 9.80 N
13.749 24.266 1.00 12.00 C
14.572 24.075 1.00 13.43 C
15.297 22.844 1.00 15.69 O
13.667 24.044 1.00 13.71 C 
12.845 25.477 1.00 13.37 C 
11.667 25.332 1.00 9.57 O 
13.389 26.668 1.00 11.47 N 
12.613 27.893 1.00 10.42 C 
13.277 28.851 1.00 12.08 C 
14.607 29.151 1.00 10.91 O 
13.334 28.216 1.00 11.99 C 
12.410 28.633 1.00 13.44 C 
11.617 29.566 1.00 12.54 O
13.128 28.217 1.00 10.42 N 
12.979 28.860 1.00 9.53 C 
13.835 30.087 1.00 10.82 C 
15.041 30.096 1.00 8.61 O 
13.190 31.132 1.00 9.62 N 
13.845 32.383 1.00 11.61 C 
13.289 32.865 1.00 11.52 C 
13.868 34.145 1.00 13.15 C 
14.998 34.282 1.00 12.58 C 
15.147 35.655 1.00 15.29 C 
15.896 33.375 1.00 11.72 C 
13.397 35.404 1.00 16.94 C 
14.156 36.320 1.00 16.85 N 
16.155 36.147 1.00 11.23 C 
16.900 33.865 1.00 12.19 C 
17.019 35.240 1.00 13.20 C 
13.659 33.448 1.00 12.90 C 
12.539 33.872 1.00 9.69 O 

14.768 33.872 1.00 10.38 N 
14.724 34.890 1.00 12.11 C 
14.734 34.226 1.00 12.14 C 
13.474 33.477 1.00 12.45 C 
13.110 32.204 1.00 14.29 C 
12.375 34.072 1.00 13.64 N 
11.384 33.197 1.00 14.53 C 
11.804 32.058 1.00 12.11 N 
15.898 35.839 1.00 12.75 C 
17.014 35.426 1.00 10.16 O
15.650 37.114 1.00 10.52 N 
16.703 38.102 1.00 13.24 C 
16.318 39.049 1.00 13.49 C 
15.612 38.216 1.00 14.12 S 
17.057 38.896 1.00 13.65 C 
16.465 38.720 1.00 13.71 O
18.042 39.770 1.00 11.96 N

18.499 40.583 1.00 8.39 C 

19.662 41.404 1.00 8.29 C 

19.720 41.723 1.00 9.88 O

20.587 41.748 1.00 10.00 N 

21.749 42.529 1.00 10.90 C 

21.607 44.021 1.00 15.03 C 

21.490 44.138 1.00 19.34 O

20.379 44.630 1.00 19.07 C 

23.021 42.003 1.00 11.41 C 

22.986 41.349 1.00 10.42 O

24.150 42.282 1.00 9.46 N 
25.430 41.835 1.00 8.75 C 
26.533 41.895 1.00 10.61 C 
27.892 41.613 1.00 8.96 C 

26.214 40.905 1.00 11.64 C 
27.223 40.898 1.00 10.98 C 
25.788 42.798 1.00 12.89 C 
25.842 44.010 1.00 12.63 O 
26.026 42.263 1.00 12.30 N 
26.372 43.109 1.00 12.21 C 
25.557 42.714 1.00 16.03 C 

25.662 41.304 1.00 16.72 O
24.097 43.079 1.00 19.29 C
27.855 43.094 1.00 14.26 C
28.353 43.994 1.00 15.02 O
28.563 42.078 1.00 13.73 N 

29.985 41.967 1.00 14.67 C 

30.215 41.668 1.00 15.41 C 
30.611 40.879 1.00 12.70 C 
29.924 39.982 1.00 13.13 O 
31.921 40.982 1.00 13.52 N 
32.680 40.019 1.00 11.96 C 
33.286 40.689 1.00 15.77 C 
32.338 41.357 1.00 19.65 C 

33.151 42.113 1.00 17.73 C 
31.462 40.313 1.00 14.60 C 
33.798 39.501 1.00 15.77 C 
34.169 40.151 1.00 15.15 O
34.330 38.332 1.00 12.74 N 
35.409 37.729 1.00 17.38 C 

36.663 38.605 1.00 24.14 C 
37.053 38.980 1.00 26.37 C 
36.726 40.067 1.00 33.64 O 
37.744 38.076 1.00 34.88 N 
35.023 37.495 1.00 16.99 C 
35.850 37.643 1.00 14.78 O
33.766 37.131 1.00 16.31 N 
33.280 36.872 1.00 19.39 C 
31.757 37.020 1.00 18.03 C 
31.349 38.374 1.00 21.23 O
33.650 35.450 1.00 19.59 C
33.946 34.620 1.00 20.53 O
33.634 35.180 1.00 20.43 N
33.957 33.857 1.00 21.50 C
35.259 33.878 1.00 23.57 C
36.384 34.009 1.00 27.72 O
32.824 33.406 1.00 21.25 C
32.181 34.222 1.00 21.50 O
32.569 32.106 1.00 19.94 N 
31.515 31.542 1.00 19.21 C 
30.203 31.350 1.00 20.02 C 
29.782 32.663 1.00 23.26 C 
30.380 30.276 1.00 19.81 C 

31.986 30.183 1.00 17.61 C 
32.857 29.561 1.00 16.06 O 
31.419 29.729 1.00 15.49 N 
31.791 28.441 1.00 15.82 C 
32.380 28.569 1.00 16.00 C 
33.577 29.355 1.00 19.34 O 
32.721 27.189 1.00 13.81 C 
30.572 27.535 1.00 16.67 C 
29.576 27.828 1.00 15.27 O 
30.663 26.451 1.00 14.40 N
29.609 25.453 1.00 18.79 C 
29.553 24.886 1.00 16.32 C 
29.168 25.871 1.00 19.43 C 
29.977 26.057 1.00 17.65 C
29.594 26.911 1.00 19.53 C

27.967 26.569 1.00 19.30 C

27.577 27.421 1.00 23.56 C 

28.392 27.588 1.00 21.66 C 

27.991 28.417 1.00 18.29 O 

29.985 24.330 1.00 16.91 C 

31.107 24.288 1.00 18.67 O 

29.058 23.400 1.00 17.80 N 

29.417 22.317 1.00 18.47 C 

28.165 21.434 1.00 19.17 C

30.679 21.567 1.00 22.88 C 

31.426 21.064 1.00 22.73 O 

27.663 23.277 1.00 19.00 C 

27.442 21.799 1.00 22.37 C 
30.923 21.509 1.00 21.24 N 
32.101 20.819 1.00 22.24 C 

31.930 20.495 1.00 23.42 C 
33.356 21.665 1.00 24.28 C 
34.464 21.142 1.00 24.95 O 
30.815 19.522 1.00 25.42 C 

29.443 20.104 1.00 29.77 C 

29.252 21.310 1.00 28.43 O 
28.552 19.350 1.00 28.50 O 
33.181 22.979 1.00 21.77 N 
34.323 23.857 1.00 21.24 C 
34.099 25.215 1.00 18.48 C 
32.991 25.549 1.00 16.77 O 
35.169 25.994 1.00 15.75 N 
35.098 27.332 1.00 15.58 C 
36.151 28.242 1.00 20.73 C 
35.973 28.218 1.00 22.66 O 
36.020 29.675 1.00 19.40 C 
35.292 27.393 1.00 13.13 C 
36.105 26.665 1.00 12.67 O 
34.523 28.263 1.00 10.86 N 
34.591 28.460 1.00 11.99 C 
33.263 28.093 1.00 10.42 C 
33.290 28.534 1.00 11.35 C 
33.044 26.587 1.00 7.40 C
34.875 29.951 1.00 13.37 C 
34.199 30.786 1.00 13.64 O 
35.870 30.288 1.00 14.96 N 
36.221 31.691 1.00 17.69 C 
37.693 31.903 1.00 19.72 C 
38.079 31.374 1.00 28.94 C 
39.545 31.642 1.00 34.45 C 
39.934 31.084 1.00 34.81 N 
40.109 29.787 1.00 37.65 C 
39.936 28.901 1.00 40.89 N 
40.450 29.373 1.00 35.27 N 
35.970 32.188 1.00 15.87 C 
35.538 31.430 1.00 14.52 O 
36.231 33.477 1.00 14.67 N 
36.064 34.084 1.00 12.41 C 
34.659 34.085 1.00 13.04 C 

34.477 34.106 1.00 11.06 O 
33.666 34.092 1.00 10.63 N 
32.270 34.063 1.00 11.37 C 
31.455 33.293 1.00 10.04 C 
31.810 31.820 1.00 8.76 C 
30.973 31.236 1.00 12.07 C 
31.559 31.072 1.00 10.99 C 
31.595 35.413 1.00 11.50 C 
31.958 36.424 1.00 13.58 O 
30.595 35.398 1.00 9.91 N 
29.829 36.586 1.00 10.95 C 

29.253 36.502 1.00 12.17 C 
28.348 37.704 1.00 11.45 C 
30.394 36.417 1.00 12.38 C 

29.931 36.143 1.00 11.96 C 
28.668 36.579 1.00 11.47 C 
27.898 35.620 1.00 11.84 O 
28.557 37.627 1.00 12.21 N 

27.478 37.716 1.00 14.56 C 
27.969 38.444 1.00 14.59 C 
26.921 38.616 1.00 23.58 C 
27.473 39.493 1.00 26.48 C 
26.489 39.769 1.00 36.15 N 








ATOM    889  CZ  ARG A 127      31.209  26.033  38.865  1.00 38.33  C
ATOM    890  NH2 ARG A 127      32.127  25.139  39.211  1.00 41.31  N
ATOM    891  NH1 ARG A 127      31.156  26.472  37.616  1.00 41.25  N
ATOM    892  C   ARG A 127      25.221  26.324  38.485  1.00 12.51  C
ATOM    893  O   ARG A 127      24.554  26.548  39.495  1.00 10.73  O
ATOM    894  N   THR A 128      25.434  25.098  38.011  1.00 11.75  N
ATOM    895  CA  THR A 128      24.867  23.924  38.667  1.00 11.43  C
ATOM    896  CB  THR A 128      23.547  23.501  37.998  1.00 12.42  C
ATOM    897  OG1 THR A 128      23.835  22.848  36.751  1.00 11.64  O
ATOM    898  CG2 THR A 128      22.668  24.719  37.728  1.00  8.69  C
ATOM    899  C   THR A 128      25.778  22.698  38.622  1.00 13.02  C
ATOM    900  O   THR A 128      26.790  22.680  37.914  1.00 12.78  O
ATOM    901  N   THR A 129      25.391  21.674  39.381  1.00 11.69  N
ATOM    902  CA  THR A 129      26.132  20.419  39.456  1.00 12.47  C
ATOM    903  CB  THR A 129      26.099  19.827  40.878  1.00 12.66  C
ATOM    904  OG1 THR A 129      24.737  19.612  41.277  1.00 11.15  O
ATOM    905  CG2 THR A 129      26.782  20.766  41.859  1.00 12.84  C
ATOM    906  C   THR A 129      25.503  19.399  38.506  1.00 15.23  C
ATOM    907  O   THR A 129      25.820  18.211  38.564  1.00 10.87  O
ATOM    908  N   VAL A 130      24.601  19.870  37.646  1.00 14.09  N
ATOM    909  CA  VAL A 130      23.923  19.006  36.680  1.00 12.55  C
ATOM    910  CB  VAL A 130      22.662  19.694  36.103  1.00 13.46  C
ATOM    911  CG1 VAL A 130      21.913  18.730  35.195  1.00 15.05  C
ATOM    912  CG2 VAL A 130      21.755  20.178  37.234  1.00 10.45  C
ATOM    913  C   VAL A 130      24.872  18.692  35.521  1.00 13.62  C
ATOM    914  O   VAL A 130      25.655  19.546  35.120  1.00 17.44  O
ATOM    915  N   CYS A 131      24.804  17.468  34.997  1.00 10.87  N
ATOM    916  CA  CYS A 131      25.658  17.047  33.886  1.00 12.09  C
ATOM    917  CB  CYS A 131      25.939  15.541  33.966  1.00 12.10  C
ATOM    918  SG  CYS A 131      24.447  14.512  33.745  1.00 14.96  S
ATOM    919  C   CYS A 131      24.957  17.343  32.568  1.00 12.93  C
ATOM    920  O   CYS A 131      23.739  17.506  32.532  1.00 11.56  O
ATOM    921  N   ALA A 132      25.723  17.403  31.486  1.00 13.76  N
ATOM    922  CA  ALA A 132      25.141  17.676  30.181  1.00 14.09  C
ATOM    923  CB  ALA A 132      24.724  19.141  30.089  1.00 13.62  C
ATOM    924  C   ALA A 132      26.086  17.337  29.042  1.00 17.97  C
ATOM    925  O   ALA A 132      27.294  17.179  29.237  1.00 15.14  O
ATOM    926  N   GLU A 133      25.508  17.215  27.853  1.00 13.21  N
ATOM    927  CA  GLU A 133      26.243  16.900  26.639  1.00 18.49  C
ATOM    928  CB  GLU A 133      25.732  15.592  26.039  1.00 21.95  C
ATOM    929  CG  GLU A 133      26.808  14.614  25.652  1.00 27.91  C
ATOM    930  CD  GLU A 133      27.336  13.850  26.840  1.00 31.31  C
ATOM    931  OE1 GLU A 133      27.870  14.494  27.767  1.00 28.79  O
ATOM    932  OE2 GLU A 133      27.214  12.606  26.846  1.00 28.57  O
ATOM    933  C   GLU A 133      25.919  18.051  25.693  1.00 15.23  C
ATOM    934  O   GLU A 133      24.915  18.738  25.866  1.00 16.37  O
ATOM    935  N   PRO A 134      26.761  18.276  24.680  1.00 16.75  N
ATOM    936  CA  PRO A 134      26.527  19.366  23.725  1.00 17.31  C
ATOM    937  CB  PRO A 134      27.558  19.082  22.638  1.00 17.01  C
ATOM    938  C   PRO A 134      25.093  19.449  23.177  1.00 18.87  C
ATOM    939  O   PRO A 134      24.468  20.515  23.204  1.00 21.16  O
ATOM    940  CD  PRO A 134      28.022  17.572  24.385  1.00 14.41  C
ATOM    941  CG  PRO A 134      28.708  18.528  23.429  1.00 15.96  C
ATOM    942  N   GLY A 135      24.577  18.329  22.683  1.00 13.73  N
ATOM    943  CA  GLY A 135      23.228  18.315  22.138  1.00 11.51  C
ATOM    944  C   GLY A 135      22.114  18.674  23.112  1.00 12.22  C
ATOM    945  O   GLY A 135      20.982  18.933  22.696  1.00 10.70  O
ATOM    946  N   ASP A 136      22.425  18.676  24.405  1.00  9.59  N
ATOM    947  CA  ASP A 136      21.451  19.019  25.441  1.00 10.66  C
ATOM    948  CB  ASP A 136      21.957  18.550  26.808  1.00  9.43  C
ATOM    949  C   ASP A 136      21.239  20.533  25.485  1.00  9.56  C
ATOM    950  O   ASP A 136      20.270  21.018  26.076  1.00  7.80  O
ATOM    951  CG  ASP A 136      21.907  17.044  26.969  1.00 12.00  C
ATOM    952  OD2 ASP A 136      21.038  16.399  26.348  1.00 14.65  O
ATOM    953  OD1 ASP A 136      22.732  16.510  27.737  1.00 11.73  O
ATOM    954  N   SER A 137      22.159  21.270  24.867  1.00 11.68  N
ATOM    955  CA  SER A 137      22.089  22.728  24.831  1.00  9.45  C
ATOM    956  CB  SER A 137      23.167  23.298  23.902  1.00 12.71  C
ATOM    957  C   SER A 137      20.723  23.231  24.381  1.00 12.56  C
ATOM    958  O   SER A 137      20.110  22.671  23.470  1.00  9.42  O
ATOM    959  OG  SER A 137      24.460  23.160  24.466  1.00 11.89  O
ATOM    960  N   GLY A 138      20.264  24.298  25.027  1.00 12.50  N
ATOM    961  CA  GLY A 138      18.974  24.873  24.698  1.00 10.84  C
ATOM    962  C   GLY A 138      17.863  24.228  25.497  1.00 11.17  C
ATOM    963  O   GLY A 138      16.759  24.774  25.583  1.00 10.27  O
ATOM    964  N   GLY A 139      18.171  23.075  26.090  1.00  9.62  N
ATOM    965  CA  GLY A 139      17.202  22.326  26.877  1.00 11.99  C
ATOM    966  C   GLY A 139      16.675  22.997  28.135  1.00  9.04  C

17.222 33.317 31.006 1.00 9.94

15.784 32.133 32.261 1.00 10.31 

14.650 32.358 31.373 1.00 12.92 

13.528 31.331 31.662 1.00 17.07 

13.026 31.491 33.088 1.00 15.81 

12.394 31.505 30.678 1.00 19.48 

14.028 33.757 31.358 1.00 12.62 

13.648 34.253 30.302 1.00 11.62 

13.927 34.405 32.510 1.00 12.76 

13.328 35.736 32.537 1.00 15.21 

13.268 36.249 33.976 1.00 13.89 

12.353 35.396 34.841 1.00 19.50 

11.367 34.848 34.347 1.00 19.07 

12.667 35.283 36.128 1.00 18.85 

13.948 36.764 31.591 1.00 12.70 

13.235 37.554 30.977 1.00 14.77 

15.278 36.778 31.458 1.00 15.34 

16.339 36.181 32.282 1.00 16.10 

15.826 37.772 30.530 1.00 16.08 

17.336 37.710 30.790 1.00 17.98 

17.539 36.351 31.399 1.00 23.99 

15.457 37.465 29.077 1.00 15.20 

15.464 38.355 28.228 1.00 10.27 

15.139 36.203 28.794 1.00 11.01 

14.769 35.813 27.437 1.00 10.79 

14.784 34.282 27.247 1.00 8.59 

14.453 33.943 25.792 1.00 10.32 

16.152 33.712 27.617 1.00 7.68 

16.184 32.189 27.604 1.00 6.34 

13.355 36.310 27.145 1.00 9.04 

13.074 36.849 26.070 1.00 9.00 

12.461 36.112 28.107 1.00 10.13 

11.080 36.544 27.951 1.00 12.20 

10.249 36.103 29.157 1.00 9.16 

10.233 34.595 29.436 1.00 10.30 

9.469 34.304 30.717 1.00 9.41 

9.598 33.873 28.268 1.00 11.50 

11.049 38.061 27.824 1.00 13.01
20.748 36.914 39.970 1.00 40.16

24.634 31.190 17.336 1.00 36.87 

5.642 30.898 42.120 1.00 38.57 

8.972 40.592 30.979 1.00 32.13 

2.047 31.605 35.777 1.00 62.75 

27.060 7.939 28.519 1.00 31.51 

4.134 24.143 10.395 1.00 19.77 

17.406 32.729 38.273 1.00 19.77 

21.370 42.268 22.477 1.00 19.75 

23.854 15.724 43.136 1.00 19.76 

19.654 34.836 37.602 1.00 19.76 

21.170 42.930 27.470 1.00 19.75 

25.304 8.005 25.551 1.00 19.75 

20.739 40.152 30.476 1.00 19.73 

19.238 15.779 6.587 1.00 19.76 

7.151 28.097 9.617 1.00 19.75 

7.122 17.869 11.543 1.00 19.75

ATOM   1435  O   HOH W 129       9.467  35.418  37.012  1.00 19.76  S  O
ATOM   1436  O   HOH W 130       5.720  23.417   6.558  1.00 19.76  S  O
ATOM   1437  O   HOH W 131       3.123  12.568  32.283  1.00 19.76  S  O
ATOM   1438  O   HOH W 132      12.909  18.142  39.232  1.00 19.75  S  O
ATOM   1439  O   HOH W 133      18.190  34.668  45.077  1.00 19.77  S  O
ATOM   1440  O   HOH W 134      16.371  23.490   8.743  1.00 19.77  S  O
ATOM   1441  O   HOH W 135      25.889  26.341  15.721  1.00 19.77  S  O
ATOM   1442  O   HOH W 138      18.831  37.368  35.694  1.00 19.75  S  O
ATOM   1443  O   HOH W 139      -1.837  27.004  34.243  1.00 19.78  S  O
ATOM   1444  O   HOH W 140      29.965  21.328  39.814  1.00 19.75  S  O
ATOM   1445  O   HOH W 141      29.084  22.512  22.380  1.00 19.74  S  O
ATOM   1446  O   HOH W 144      26.825  34.183  16.982  1.00 19.75  S  O
ATOM   1447  O   HOH W 146      28.060  21.125  26.874  1.00 19.76  S  O
ATOM   1448  O   HOH W 147       7.953  28.465  43.320  1.00 19.76  S  O
ATOM   1449  O   HOH W 148      25.139  13.555  38.510  1.00 19.76  S  O
ATOM   1450  O   HOH W 154      27.898  15.263  40.931  1.00 19.75  S  O
ATOM   1451  O   HOH W 157      29.305  18.029  39.665  1.00 19.76  S  O
ATOM   1452  O   HOH W 158      22.038  30.753   9.108  1.00 19.76  S  O
ATOM   1453  O   HOH W 159      18.399  11.163  36.207  1.00 19.76  S  O
ATOM   1454  O   HOH W 164      26.335  11.937  35.945  1.00 19.75  S  O
ATOM   1455  O   HOH W 165       1.758  29.855  17.357  1.00 19.75  S  O
ATOM   1456  O   HOH W 166      24.163  39.471  32.170  1.00 19.76  S  O
ATOM   1457  O   HOH W 170      16.077  17.918   7.749  1.00 19.75  S  O
ATOM   1458  O   HOH W 172      32.921  14.044  27.295  1.00 19.76  S  O
ATOM   1459  O   HOH W 177      32.795  38.969  32.954  1.00 19.77  S  O
ATOM   1460  O   HOH W 179       4.059   6.708  28.892  1.00 19.75  S  O
ATOM   1461  O   HOH W 180      25.397  29.865  14.090  1.00 19.76  S  O
ATOM   1462  O   HOH W 182      11.078  20.731  43.859  1.00 19.77  S  O
ATOM   1463  O   HOH W 184      30.825  30.779  39.402  1.00 19.77  S  O
ATOM   1464  O   HOH W 187      10.289  21.108   7.474  1.00 19.75  S  O
ATOM   1465  O   HOH W 189      27.314  38.906  38.135  1.00 19.76  S  O
ATOM   1466  O   HOH W 197      25.884  26.959  11.320  1.00 19.70  S  O
ATOM   1467  O   HOH W 209       9.364  16.866  38.731  1.00 19.73  S  O
ATOM   1468  O   HOH W 219      32.352  16.134  38.786  1.00 19.73  S  O
ATOM   1469  O   HOH W 221      15.972  35.898  37.609  1.00 19.69  S  O
ATOM   1470  O   HOH W 223       3.319  35.758  13.483  1.00 19.71  S  O
TER    1471      HOH W 223
END
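Coordinate records of this kind can be read programmatically. The following is a minimal sketch, assuming each record is written on one line as whitespace-separated fields in the order record type, serial number, atom name, residue name, chain, residue number, x, y, z, occupancy, and B-factor. The names `AtomRecord` and `parse_atom_line` are hypothetical and are not part of any program named in this document; strict PDB files use fixed columns, so splitting on whitespace is a simplification suited only to the listing as printed here.

```python
from typing import NamedTuple, Optional


class AtomRecord(NamedTuple):
    """One coordinate record (hypothetical container for illustration)."""
    serial: int
    atom: str
    residue: str
    chain: str
    res_num: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Optional[AtomRecord]:
    """Parse one whitespace-separated ATOM line; return None for other lines."""
    fields = line.split()
    # TER and END lines, and any truncated record, are skipped.
    if len(fields) < 11 or fields[0] != "ATOM":
        return None
    return AtomRecord(
        serial=int(fields[1]),
        atom=fields[2],
        residue=fields[3],
        chain=fields[4],
        res_num=int(fields[5]),
        x=float(fields[6]),
        y=float(fields[7]),
        z=float(fields[8]),
        occupancy=float(fields[9]),
        b_factor=float(fields[10]),
    )


# Two water records and the terminating lines from the listing.
lines = [
    "ATOM 1435 O HOH W 129 9.467 35.418 37.012 1.00 19.76 S O",
    "ATOM 1436 O HOH W 130 5.720 23.417 6.558 1.00 19.76 S O",
    "TER 1471 HOH W 223",
    "END",
]

atoms = [rec for rec in map(parse_atom_line, lines) if rec is not None]
for rec in atoms:
    print(rec.serial, rec.residue, rec.res_num, rec.x, rec.y, rec.z)
```

Any trailing fields after the B-factor are ignored by this sketch, so it makes no assumption about their meaning.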

The surface-accessible residues of ASP were determined from the crystallographic
coordinates provided above, using the program DS Modeling (Accelrys) with the default
settings. The total surface accessibility (SA) for ASP was found to be 8044.777 square Angstroms.
Table 19-2 provides the total SA, the side chain SA, and the percent SAS, where percent SAS
is the percentage of an amino acid's total surface that is accessible to solvent.
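The relationship between total SA and percent SAS can be illustrated with a short calculation. The sketch below uses a few rows of Table 19-2 and assumes that percent SAS equals the total SA divided by the residue's full surface, multiplied by 100, as the definition above suggests. The function name `implied_full_surface` and the 50 percent cutoff are hypothetical choices made for illustration; they are not taken from DS Modeling or from this document.

```python
# Rows copied from Table 19-2: (residue, total SA, side chain SA, percent SAS).
rows = [
    ("asp 1:Phe", 89.992, 66.420, 36.954),
    ("asp 2:Asp", 85.970, 68.625, 48.199),
    ("asp 4:Ile", 17.921, 12.076, 9.714),
    ("asp 12:Gly", 81.658, 30.191, 73.513),
    ("asp 13:Gly", 75.236, 18.114, 67.615),
    ("asp 14:Arg", 124.289, 124.289, 55.664),
]


def implied_full_surface(total_sa: float, percent_sas: float) -> float:
    """Full residue surface implied by percent SAS = 100 * total SA / full surface."""
    return total_sa / (percent_sas / 100.0)


# Hypothetical cutoff for calling a residue highly exposed (an assumption).
EXPOSED_CUTOFF = 50.0

exposed = [name for name, _total, _side, pct in rows if pct >= EXPOSED_CUTOFF]

for name, total, _side, pct in rows:
    print(f"{name:12s} implied full surface: {implied_full_surface(total, pct):8.2f}")
print("Residues at or above the cutoff:", exposed)
```

A filter of this kind is one way to shortlist highly exposed positions; the appropriate cutoff depends on the application.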



Table 19-2. Total Surface Accessibility of ASP 



Residue       Total SA (Å²)   Side Chain SA (Å²)   Percent SAS

asp 1:Phe           89.992               66.420        36.954
asp 2:Asp           85.970               68.625        48.199
asp 4:Ile           17.921               12.076         9.714
asp 7:Asn           40.541               40.541        21.246
asp 8:Ala           41.497               24.153        35.259
asp 10:Thr          35.846               35.846        21.190
asp 11:Ile          29.424               18.114        17.028
asp 12:Gly          81.658               30.191        73.513
asp 13:Gly          75.236               18.114        67.615
asp 14:Arg         124.289              124.289        55.664
asp 15:Ser          29.424               29.424        19.554
asp 16:Arg         105.411               88.447        38.127
asp 22:Ala          11.690                0.000         9.932
asp 24:Asn          71.105               65.067        47.079
asp 25:Gly          53.190               30.191        43.325
asp 32:His          34.693               17.728        19.568
asp 34:Val          18.114               12.076        20.656

asp 35:Arg 


177.087 


171.242 


69.918 




asp 36:Thr 


87.506 


64.886 


45.401 




asp 37:G!y 


58.465 


24.153 


55.659 


5 


asp 38:Ala 


18.114 


12.076 


16.195 




asp 39:Thr 


99.579 


87.889 


55.002 




asp 40:Thr 


11.310 


0.000 


6.469 




asp 41: Ala 


36.229 


36.229 


38.182 




asp 42:Asn 


86.537 


74.844 


43.919 


10 


asp 43: Pro 


6.038 


0.000 


4.599 




asp 44:Thr 


111 .082 


99.582 


59.375 




asp 45:Gly 


6.038 


6.038 


5.436 




asp 46:Thr 


52.427 


52.427 


28.958 




, asp 47:Phe 


5.655 


0.000 


2.715 


15 


asp 48:Ala 


58.848 


30.191 • 


52.705 




asp 49:Gly 


12.076 


12.076 


12.937 




asp 50:Ser 


51 .274 


0.000 


37.049 




asp 51:Ser 


1 7.348 


17.348 


1 1 .573 




asp 52:Phe 


52.040 


12.076 


25.034 


20 


asp 53: Pro 


53.193 


36.229 


40.51 1 




asp 54:Gly 


30.191 


30.191 


27.274 




asp 55:Asn 


34.499 


34.499 


18.613 




asp 57:Tyr         28.658         28.658        11.861
asp 59:Phe         18.114         18.114         9.808
asp 61:Arg        146.706        141.051        59.429
asp 62:Thr         22.619          5.655        12.939
asp 63:Gly         17.538          6.038        17.646
asp 64:Ala        112.229         60.381        90.564
asp 65:Gly         70.535         30.191        60.226
asp 66:Val         16.965          0.000        10.967
asp 67:Asn         69.002         62.964        39.692
asp 68:Leu         34.503          6.038        16.536
asp 69:Leu         42.267         42.267        20.295
asp 71:Gln         39.774         39.774        18.552
asp 73:Asn         17.345         17.345         8.760
asp 74:Asn         41.301         41.301        25.351
asp 75:Tyr         93.544         47.922        37.830
asp 76:Ser         97.666         52.044        76.965
asp 77:Gly         81.275         24.153        73.294
asp 78:Gly         17.921         12.076        18.067
asp 79:Arg        139.911         94.292        56.632
asp 80:Val         36.229         30.191        22.621
asp 81:Gln         82.421         70.921        37.295
asp 83:Ala         41.117         24.153        33.386
asp 84:Gly         12.076         12.076        12.151
asp 85:His         71.298         65.454        36.451
asp 86:Thr        111.082         93.544        65.517
asp 87:Ala         64.886         42.267        52.523
asp 88:Ala         12.076    [illegible]        10.760
asp 89:Pro         90.572         78.496        58.405
asp 90:Val         94.694    [illegible]   [illegible]
asp 91:Gly         58.082         18.114   [illegible]
asp 92:Ser         34.886         23.003        27.450
asp 93:Ala         83.381         60.381        70.846
asp 95:Cys         26.565         26.565        15.773
asp 99:Ser         39.584          0.000        29.907
asp 100:Thr        87.123    [illegible]        48.121
asp 101:Thr        34.696          6.038        22.060
asp 102:Gly        12.076         12.076        13.771
asp 103:Trp        70.728         47.919        27.630
asp 104:His   [illegible]    [illegible]   [illegible]
asp 105:Cys        54.609         31.799        33.796
asp 106:Gly        23.386         12.076        23.531
asp 107:Thr        47.155         47.155        29.873
asp 108:Ile         5.655          0.000         2.888
asp 109:Thr        64.503         30.191        35.741
asp 110:Ala        24.153         24.153        21.668
asp 111:Leu        71.115         48.305        36.142
asp 112:Asn       138.770        104.841        66.301
asp 113:Ser        17.731         11.693        12.794
asp 114:Ser        92.391         52.427        63.967
asp 115:Val        30.191         24.153        18.166
asp 116:Thr       128.237         82.618        66.534
asp 117:Tyr        35.846         24.153        15.603
asp 118:Pro       159.964        102.648        93.188
asp 119:Glu       132.745         87.123        63.766
asp 120:Gly        18.114         18.114        20.611
asp 121:Thr        93.924         76.579        48.828
asp 123:Arg       129.748        129.748        59.619
asp 124:Gly        29.231         12.076        26.315
asp 126:Ile         6.038          6.038         3.084
asp 127:Arg        99.943         99.943        36.957
asp 128:Thr         5.655          0.000         3.450
asp 129:Thr        76.579         59.615        45.219
asp 130:Val         0.000          0.000         0.000
asp 131:Cys        25.568         19.723        18.583
asp 132:Ala        11.693          6.038         9.495
asp 133:Glu        40.734         29.041        20.057
asp 134:Pro       114.531        102.648        68.994
asp 135:Gly        11.883          6.038        11.979
asp 137:Ser         5.655          5.655         3.915
asp 143:Ala        17.731          6.038        18.763
asp 144:Gly        59.612         36.229        63.599
asp 145:Asn        81.832         70.142        44.061
asp 146:Gln        52.810         52.810        27.510
asp 147:Ala         5.655          0.000         4.797
asp 148:Gln        11.500          5.845         5.335
asp 152:Ser         5.655          0.000         4.092
asp 153:Gly        24.153         18.114        25.819
asp 154:Gly        63.927         12.076        64.322
asp 155:Ser        88.656         70.541        69.864
asp 156:Gly        52.807         18.114        50.090
asp 157:Asn        35.263         35.263        20.195
asp 158:Cys        34.312          6.038        21.893
asp 159:Arg       199.716        154.094        79.090
asp 160:Thr       135.044         89.422        85.862
asp 161:Gly        35.462         24.153        33.699
asp 162:Gly        23.576          6.038        21.225
asp 163:Thr        46.005         46.005        25.438
asp 164:Thr         5.655          5.655         3.127
asp 165:Phe        24.153         24.153        10.669
asp 167:Gln         5.845          5.845         3.042
asp 168:Pro        48.305         48.305        31.227
asp 170:Asn        59.032         53.377        31.882
asp 171:Pro        59.615         42.267        42.027
asp 173:Leu        17.731         12.076         8.274
asp 174:Gln       145.572        122.569        80.497
asp 175:Ala        52.044          6.038        44.291
asp 176:Tyr        64.886         36.229        29.811
asp 177:Gly        69.775         24.153        70.340
asp 178:Leu        11.693          6.038         5.788
asp 179:Arg       182.932        182.932        72.390
asp 180:Met        34.886         12.076        17.253
asp 181:Ile        36.229         30.191        19.053
asp 182:Thr        99.389         76.579        60.785
asp 183:Thr       104.854         93.544        68.979
asp 184:Asp       122.008         23.386        52.822

The ASP coordinates and those of homologous structures were loaded into MOE
(Chemical Computing Group). Coordinates for waters and ligands were removed. The
structures were then aligned with MOE Align using actual secondary structure, with
structural alignment and chain superposition enabled. This resulted in the following
structural alignment. The numbers indicated refer to the mature ASP protease amino acid
sequence.

PDB ID   1 10 20 30 40
ASP      FDVIGGNAYTIG-GRSRCSIGFAVN GG F I TAGHCGRTG ATTAN PTGTFA
1HPG     -- VLGGGAIYGG -GSR- C SAAFNVTK-GGARYFVTAGHCTNI SANWSAS S -GG SWGVRE
1SGP     -- I SGGDAI YSS - TGR- C SLGFNVRS -GSTYYFLTAGHCTIX3ATTWWANSARTTVX.GTTS
1TAL     ANIVGGI EYS INNASL - C SVGFSVTR-GATKGFVTAGHCGTVNATARIG GAWGTFA
2SFA     -- I AGGEAI YAAGGGR- C SLGFNVRS S SGATYALTAGHCTEI ASTWYTNSGQTSDLGTRA
2SGA     -- I AGGEAITT-GGSR-C SLGFNVSV-NGVAHALTAGHCTNI SASWS IGTRT

PDB ID   50 60 70 80 90 100
ASP      GSSFPGNDYAFVRTGAG-VNLLAQVNNYSGGRVQVAGHTAAPVGSAVCRSGSTTGWHC
1HPG     GTSFPTNDYGIVRYTDG- SS PAGTVDLYNGSTQDI SSAANAVVGQAIKKSGSTTKVTSGT
1SGP     GSSF PNNDYG I VRYTNTT I PKDGTVG GQDITSAANATVGMAVTRRGSTTGTHSGS
1TAL     ARVFPGNDRAWVSLTSA- QTLLPRVANG - SSFVTVRG STEAAVGAAVCRSGRTTGYQCGT
2SFA     GTSFPGNDYGLIRHSNA- SAADGRVYLiYNGSYRDITGAGNAYVGQTVQRSGSTTGLHSGR
2SGA     GTSFPNNDYGIIRHSNP-AAADGRVYIiYNGSYQDITTAGNAFVGQAVQRSGSTTGLRSGS

PDB ID   110 120 130 140 150 160
ASP      ITALNSSVTYPE-GTVRGLIRTTVCAEPGDSGGSLLA-GNQAQGVTSGGSG NCRT
1HPG     VTAVNVTVNYGD-GPVYNMVRTTACSAGGDSGGAHFA-GSVALGIHSGSSG CSG
1SGP     VTALNATVNYGGGDVVYGMIRTNVCAEPGDSGGPLYS-GTRAIGIjTSGGSG NCSS
1TAL     ITAKNVTANYAE-GAVRGLTQGNACMGRGDSGGSWITSAGQAQGVMSGGNVQSNGNNCGI
2SFA     VTGLNATVNYGGGDIVSGLIQTNVCAEPGDSGGAIiFA-GSTALGLTSGGSG NCRT
2SGA     VTGLNATVNYGSSG IVYGMI QTNVCAQ PGDSGGSLFA-G STALGLTSGGSG NCRT

PDB ID   170 180
ASP      G GTTFFQPVNP ILQAYGLRMITTD (SEQ ID NO: 624)
1HPG     TA- -GSAIHQPVTEAIiSAYGVTVY (SEQ ID NO: 625)
1SGP     G GTTFFQPVTEALVAYGV SVY (SEQ ID NO: 626)
1TAL     PASQRSSLFERDQPILSQYGLSLVTG- (SEQ ID NO: 627)
2SFA     G GTTFFQPVTEALSAYGVSIL (SEQ ID NO: 628)
2SGA     G GTTFYQPVTEALSAYGATVIj (SEQ ID NO: 629)

In the above alignment, the codes are as follows: 

1HPG = Streptomyces griseus glutamic acid specific protease. 
1SGP = Streptomyces griseus proteinase B
1SGT = Streptomyces griseus strain K1 trypsin 
1TAL = Lysobacter enzymogenes alpha-lytic protease 
2SFA = Streptomyces fradiae serine proteinase 
2SGA = Streptomyces griseus protease A 
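
Because the alignment is numbered according to the mature ASP sequence, finding the residue of a homologous structure that corresponds to a given ASP position amounts to counting non-gap characters up to the same alignment column. The following Python sketch shows that bookkeeping. The two aligned strings in the usage lines are short toy strings chosen for illustration only (they are not rows of the alignment above), and the function names are hypothetical.

```python
from typing import List, Optional


def column_numbers(aligned: str, gap: str = "-") -> List[Optional[int]]:
    """For each alignment column, give the 1-based residue number, or None at a gap."""
    numbers: List[Optional[int]] = []
    count = 0
    for ch in aligned:
        if ch == gap:
            numbers.append(None)
        else:
            count += 1
            numbers.append(count)
    return numbers


def equivalent_residue(ref_aligned: str, other_aligned: str, ref_number: int) -> Optional[int]:
    """Map residue ref_number of the reference row onto the other row.

    Returns None when the other sequence has a gap in that column.
    Raises ValueError if ref_number does not occur in the reference row.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned rows must have the same number of columns")
    column = column_numbers(ref_aligned).index(ref_number)
    return column_numbers(other_aligned)[column]


# Toy rows (illustrative only), ten columns each.
REF = "FDVIGG-NAY"
OTHER = "--VLGGGAIY"

print(equivalent_residue(REF, OTHER, 3))  # 1
print(equivalent_residue(REF, OTHER, 1))  # None
```

The same counting applies to any pair of rows once the gap characters of the original alignment are available.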




EXAMPLE 20 

Enzyme Substrate Modeling and Mapping of the ASP Active-Site

In this Example, methods for enzyme-substrate modeling and mapping of the ASP active
site are described. Preliminary inspection of the active site revealed a P1 binding pocket
large enough to accommodate large hydrophobic groups such as the side chains of Trp,
Tyr, and Phe.
The crystal structure of streptogrisin A in complex with the third domain of the turkey
ovomucoid inhibitor (PDB code 2SGB) has been determined. 2SGB was structurally aligned
to ASP using MOE (Chemical Computing Group), which places the inhibitor in the active
site of ASP. All of the 2SGB coordinates were removed, except for those which define a
hexapeptide bound in the ASP active site, corresponding to binding at the S4 to S2'
binding sites. The Pro-ASP protein self-cleaves at the pro domain-mature domain junction
to release the mature protease enzyme. The last four residues of the pro domain are
expected to occupy the S1-S4 sites, and the first two residues of the mature protease
occupy the S1' and S2' sites. Therefore, the hexapeptide in the active site was mutated
in silico to the sequence PRTMFD (SEQ ID NO:630).
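
The assignment of the hexapeptide residues to subsites can be written out explicitly. The Python sketch below assumes the standard protease convention in which the residue immediately before the scissile bond occupies S1 and the residue immediately after it occupies S1'. It also assumes that PRTMFD splits as PRTM (end of the pro domain) and FD (start of the mature protease, which begins Phe1-Asp2). The function name is hypothetical.

```python
from typing import Dict

# Subsites in N-terminal to C-terminal order of the bound peptide.
SUBSITES = ["S4", "S3", "S2", "S1", "S1'", "S2'"]


def assign_subsites(pro_tail: str, mature_head: str) -> Dict[str, str]:
    """Pair the last four pro-domain residues and first two mature residues with subsites."""
    if len(pro_tail) != 4 or len(mature_head) != 2:
        raise ValueError("expected four pro-domain residues and two mature residues")
    return dict(zip(SUBSITES, pro_tail + mature_head))


mapping = assign_subsites("PRTM", "FD")
print(mapping)
# {'S4': 'P', 'S3': 'R', 'S2': 'T', 'S1': 'M', "S1'": 'F', "S2'": 'D'}
```

Under these assumptions the scissile bond lies between Met (S1) and Phe (S1').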

From inspection of the structure of the initial substrate-bound model, the backbone
amides of Gly135 and Asp136 would be expected to form the oxy-anion hole. However, the
amide nitrogen of Gly135 appears to point in the wrong direction. Comparison with
streptogrisin A confirms this. Thus, it is presumed that a conformational change in ASP is
required to form the oxy-anion hole. However, it is not intended that the present invention
be limited by any particular mechanism or hypothesis. The peptide backbone between
residues 134 and 135 was altered to adopt an orientation similar to that of the structurally
equivalent atoms in the streptogrisin A structure. The enzyme-substrate model was then
energy minimized.

Residues within 6 Å of the modeled substrate were determined using the proximity
tools within the program QUANTA. These residues were identified as: Arg14, Ser15,
Arg16, Cys17, His32, Cys33, Phe52, Asp56, Thr100, Val115, Thr116, Tyr117, Pro118,
Glu119, Ala132, Glu133, Pro134, Gly135, Asp136, Ser137, Thr151, Ser152, Gly153,
Gly154, Ser155, Gly156, Asn157, Thr164, Phe165. Of these, His32, Asp56, and Ser137
form the catalytic triad.
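
A distance-based selection of this kind can be sketched in a few lines of Python. This is not the QUANTA proximity tool itself, only a plain all-pairs distance check under stated assumptions: atom records are whitespace-separated in the order used in Table 20-1 (record type, serial, atom name, residue name, chain, residue number, x, y, z), and the substrate atom position used in the usage lines is a hypothetical point invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def parse_atom(line: str) -> Dict[str, object]:
    """Parse one whitespace-separated ATOM record (field order assumed as in Table 20-1)."""
    f = line.split()
    return {
        "name": f[2],
        "res": f[3],
        "chain": f[4],
        "num": int(f[5]),
        "xyz": (float(f[6]), float(f[7]), float(f[8])),
    }


def residues_within(protein: List[dict], substrate: List[dict],
                    cutoff: float = 6.0) -> List[Tuple[int, str]]:
    """Return (number, name) of protein residues with any atom within cutoff of any substrate atom."""
    near = set()
    for p in protein:
        if any(math.dist(p["xyz"], s["xyz"]) <= cutoff for s in substrate):
            near.add((p["num"], p["res"]))
    return sorted(near)


# Two protein atoms copied from Table 20-1.
protein = [parse_atom(line) for line in (
    "ATOM 1 N PHE A 1 2.452 18.495 15.165 0.00 N1+",
    "ATOM 107 OG SER A 15 19.396 16.382 17.701 0.00 O",
)]
# Hypothetical substrate atom (coordinates invented for this example).
substrate = [{"xyz": (20.0, 15.0, 20.0)}]

print(residues_within(protein, substrate))  # [(15, 'SER')]
```

For a full model the quadratic loop is usually acceptable at this size; a spatial grid or k-d tree would be the usual replacement for larger systems.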

The P1 pocket is formed by Cys131, Ala132, Glu133, Pro134, Gly135, Thr151,
Ser152, Gly153, Gly154, Ser155, Gly156, Asn157, Gly162, Thr163, and Thr164. The P2
pocket is defined by Phe52, Tyr117, Pro118, and Glu119. The P3 pocket has main-chain to
main-chain hydrogen bonding from Gly154 to the substrate main chain. The P1' pocket is
defined by Arg16 and His32. The P2' pocket is defined by Thr100 and Pro134. The
atomic coordinates of ASP with the modeled octapeptide substrate are provided in
Table 20-1 below.



Table 20-1. Atomic Coordinates of ASP with the Modeled Octapeptide Substrate 



ATOM      1  N    PHE A   1       2.452  18.495  15.165  0.00  N1+
ATOM      2  CA   PHE A   1       3.712  18.208  15.901  0.00  C
ATOM      3  CB   PHE A   1       4.906  18.646  15.055  0.00  C
ATOM      4  C    PHE A   1       3.743  18.914  17.254  0.00  C
ATOM      5  O    PHE A   1       3.539  20.133  17.340  0.00  O
ATOM      6  CG   PHE A   1       6.232  18.405  15.707  0.00  C
ATOM      7  CD2  PHE A   1       6.963  17.268  15.411  0.00  C
ATOM      8  CD1  PHE A   1       6.750  19.312  16.618  0.00  C
ATOM      9  CE2  PHE A   1       8.192  17.035  16.010  0.00  C
ATOM     10  CE1  PHE A   1       7.981  19.086  17.222  0.00  C
ATOM     11  CZ   PHE A   1       8.702  17.946  16.917  0.00  C
ATOM     12  N    ASP A   2       4.000  18.148  18.311  0.00  N
ATOM     13  CA   ASP A   2       4.052  18.708  19.659  0.00  C
ATOM     14  CB   ASP A   2       3.584  17.678  20.688  0.00  C
ATOM     15  C    ASP A   2       5.422  19.210  20.066  0.00  C
ATOM     16  O    ASP A   2       6.415  18.508  19.925  0.00  O
ATOM     17  CG   ASP A   2       2.109  17.354  20.560  0.00  C
ATOM     18  OD2  ASP A   2       1.597  16.558  21.379  0.00  O1-
ATOM     19  OD1  ASP A   2       1.459  17.889  19.638  0.00  O
ATOM     20  N    VAL A   3       5.464  20.440  20.562  0.00  N
ATOM     21  CA   VAL A   3       6.707  21.057  21.009  0.00  C
ATOM     22  CB   VAL A   3       6.736  22.574  20.718  0.00  C
ATOM     23  C    VAL A   3       6.737  20.837  22.513  0.00  C
ATOM     24  O    VAL A   3       5.806  21.233  23.216  0.00  O
ATOM     25  CG1  VAL A   3       7.921  23.222  21.425  0.00  C
ATOM     26  CG2  VAL A   3       6.840  22.810  19.220  0.00  C
ATOM     27  CB   ILE A   4       7.602  18.448  24.730  0.00  C
ATOM     28  CG2  ILE A   4       7.684  18.189  26.227  0.00  C
ATOM     29  CG1  ILE A   4       6.196  18.137  24.220  0.00  C
ATOM     30  CD1  ILE A   4       5.768  16.711  24.456  0.00  C
ATOM     31  C    ILE A   4       9.379  20.168  24.911  0.00  C
ATOM     32  O    ILE A   4      10.346  19.836  24.229  0.00  O
ATOM     33  N    ILE A   4       7.801  20.200  22.997  0.00  N
ATOM     34  CA   ILE A   4       7.955  19.916  24.423  0.00  C
ATOM     35  N    GLY A   5       9.499  20.743  26.103  0.00  N
ATOM     36  CA   GLY A   5      10.807  21.030  26.653  0.00  C
ATOM     37  C    GLY A   5      11.655  19.787  26.819  0.00  C
ATOM     38  O    GLY A   5      11.171  18.750  27.277  0.00  O
ATOM     39  N    GLY A   6      12.927  19.885  26.443  0.00  N
ATOM     40  CA   GLY A   6      13.817  18.747  26.572  0.00  C
ATOM     41  C    GLY A   6      14.007  17.948  25.294  0.00  C
ATOM     42  O    GLY A   6      14.990  17.217  25.157  0.00  O
ATOM     43  N    ASN A   7      13.069  18.082  24.359  0.00  N
ATOM     44  CA   ASN A   7      13.155  17.351  23.100  0.00  C
ATOM     45  CB   ASN A   7      11.784  17.247  22.450  0.00  C
ATOM     46  CG   ASN A   7      10.918  16.210  23.102  0.00  C
ATOM     47  OD1  ASN A   7       9.741  16.069  22.760  0.00  O
ATOM     48  ND2  ASN A   7      11.492  15.464  24.049  0.00  N
ATOM     49  C    ASN A   7      14.124  17.933  22.086  0.00  C
ATOM     50  O    ASN A   7      14.466  19.114  22.119  0.00  O
ATOM     51  N    ALA A   8      14.561  17.077  21.176  0.00  N
ATOM     52  CA   ALA A   8      15.486  17.487  20.138  0.00  C
ATOM     53  CB   ALA A   8      16.212  16.271  19.577  0.00  C
ATOM     54  C    ALA A   8      14.716  18.174  19.023  0.00  C
ATOM     55  O    ALA A   8      13.509  17.988  18.874  0.00  O
ATOM     56  N    TYR A   9      15.423  18.993  18.262  0.00  N
ATOM     57  CA   TYR A   9      14.847  19.714  17.143  0.00  C
ATOM     58  CB   TYR A   9      14.253  21.064  17.580  0.00  C
ATOM     59  CG   TYR A   9      15.221  22.148  17.963  0.00  C
ATOM     60  CD2  TYR A   9      15.517  22.398  19.301  0.00  C
ATOM     61  CE2  TYR A   9      16.341  23.443  19.663  0.00  C
ATOM     62  CD1  TYR A   9      15.785  22.972  16.993  0.00  C
ATOM     63  CE1  TYR A   9      16.609  24.021  17.343  0.00  C
ATOM     64  CZ   TYR A   9      16.883  24.255  18.678  0.00  C
ATOM     65  OH   TYR A   9      17.688  25.309  19.029  0.00  O
ATOM     66  C    TYR A   9      16.072  19.837  16.262  0.00  C
ATOM     67  O    TYR A   9      17.188  19.678  16.753  0.00  O
ATOM     68  N    THR A  10      15.886  20.077  14.970  0.00  N
ATOM     69  CA   THR A  10      17.034  20.183  14.082  0.00  C
ATOM     70  CB   THR A  10      17.031  19.031  13.041  0.00  C
ATOM     71  OG1  THR A  10      15.822  19.082  12.269  0.00  O
ATOM     72  CG2  THR A  10      17.129  17.676  13.741  0.00  C
ATOM     73  C    THR A  10      17.205  21.488  13.329  0.00  C
ATOM     74  O    THR A  10      16.249  22.243  13.104  0.00  O
ATOM     75  N    ILE A  11      18.453  21.734  12.938  0.00  N
ATOM     76  CA   ILE A  11      18.828  22.930  12.197  0.00  C
ATOM     77  CB   ILE A  11      19.609  23.914  13.093  0.00  C
ATOM     78  CG2  ILE A  11      19.855  25.221  12.343  0.00  C
ATOM     79  CG1  ILE A  11      18.811  24.187  14.369  0.00  C
ATOM     80  CD1  ILE A  11      19.546  25.036  15.385  0.00  C
ATOM     81  C    ILE A  11      19.712  22.442  11.054  0.00  C
ATOM     82  O    ILE A  11      20.772  21.856  11.284  0.00  O
ATOM     83  N    GLY A  12      19.274  22.668   9.821  0.00  N
ATOM     84  CA   GLY A  12      20.048  22.193   8.689  0.00  C
ATOM     85  C    GLY A  12      20.344  20.705   8.845  0.00  C
ATOM     86  O    GLY A  12      21.439  [illegible]  [illegible]  0.00  O
ATOM     87  N    GLY A  13      19.373  19.957   9.361  0.00  N
ATOM     88  CA   GLY A  13      19.564  18.531   9.545  0.00  C
ATOM     89  C    GLY A  13      20.373  18.127  10.769  0.00  C
ATOM     90  O    GLY A  13      20.438  16.945  11.103  0.00  O
ATOM     91  N    ARG A  14      20.984  19.091  11.449  0.00  N
ATOM     92  CA   ARG A  14      21.787  18.782  12.627  0.00  C
ATOM     93  CB   ARG A  14      23.036  19.670  12.669  0.00  C
ATOM     94  C    ARG A  14      21.018  18.938  13.935  0.00  C
ATOM     95  O    ARG A  14      20.441  19.982  14.212  0.00  O
ATOM     96  CG   ARG A  14      24.251  19.072  11.964  0.00  C
ATOM     97  CD   ARG A  14      24.065  19.084  10.450  0.00  C
ATOM     98  NE   ARG A  14      24.173  17.752   9.858  0.00  N1+
ATOM     99  CZ   ARG A  14      25.316  17.100   9.660  0.00  C
ATOM    100  NH1  ARG A  14      26.474  17.655  10.004  0.00  N
ATOM    101  NH2  ARG A  14      25.302  15.886   9.120  0.00  N
ATOM    102  N    SER A  15      21.016  17.878  14.733  0.00  N
ATOM    103  CA   SER A  15      20.335  17.870  16.017  0.00  C
ATOM    104  CB   SER A  15      20.062  16.429  16.454  0.00  C
ATOM    105  C    SER A  15      21.312  18.525  16.983  0.00  C
ATOM    106  O    SER A  15      21.933  17.849  17.803  0.00  O
ATOM    107  OG   SER A  15      19.396  16.382  17.701  0.00  O
ATOM    108  N    ARG A  16      21.454  19.841  16.867  0.00  N
ATOM    109  CA   ARG A  16      22.362  20.594  17.724  0.00  C
ATOM    110  CB   ARG A  16      22.741  21.927  17.073  0.00  C
ATOM    111  C    ARG A  16      21.815  20.907  19.104  0.00  C
ATOM    112  O    ARG A  16      22.550  20.867  20.088  0.00  O
ATOM    113  CG   ARG A  16      23.719  21.851  15.915  0.00  C
ATOM    114  CD   ARG A  16      24.200  23.253  15.549  0.00  C
ATOM    115  NE   ARG A  16      24.625  23.984  16.745  0.00  N1+
ATOM    116  CZ   ARG A  16      25.242  25.166  16.739  0.00  C
ATOM    117  NH2  ARG A  16      25.581  25.735  17.888  0.00  N
ATOM    118  NH1  ARG A  16      25.528  25.781  15.597  0.00  N
ATOM    119  N    CYS A  17      20.526  21.215  19.178  0.00  N
ATOM    120  CA   CYS A  17      19.928  21.546  20.455  0.00  C
ATOM    121  CB   CYS A  17      19.800  23.068  20.553  0.00  C
ATOM    122  C    CYS A  17      18.599  20.911  20.803  0.00  C
ATOM    123  O    CYS A  17      18.071  20.077  20.071  0.00  O
ATOM    124  SG   CYS A  17      21.393  23.932  20.696  0.00  S
ATOM    125  N    SER A  18      18.066  21.348  21.942  0.00  N
ATOM    126  CA   SER A  18      16.799  20.865  22.455  0.00  C
ATOM    127  CB   SER A  18      17.042  20.053  23.723  0.00  C
ATOM    128  OG   SER A  18      18.081  19.111  23.521  0.00  O
ATOM    129  C    SER A  18      15.871  22.030  22.769  0.00  C
ATOM    130  O    SER A  18      16.312  23.175  22.890  0.00  O
ATOM    131  N    ILE A  19      14.584  21.728  22.892  0.00  N
ATOM    132  CA   ILE A  19      13.582  22.737  23.195  0.00  C
ATOM    133  CB   ILE A  19      12.150  22.152  23.125  0.00  C
ATOM    134  CG2  ILE A  19      11.133  23.215  23.532  0.00  C
ATOM    135  CG1  ILE A  19      11.852  21.634  21.715  0.00  C
ATOM    136  CD1  ILE A  19      11.832  22.709  20.655  0.00  C
ATOM    137  C    ILE A  19      13.794  23.273  24.614  0.00  C
ATOM    138  O    ILE A  19      14.070  22.505  25.545  0.00  O
ATOM    139  N    GLY A  20      13.670  24.589  24.774  0.00  N
ATOM    140  CA   GLY A  20      13.818  25.185  26.088  0.00  C
ATOM    141  C    GLY A  20      12.443  25.203  26.722  0.00  C
ATOM    142  O    GLY A  20      12.122  24.389  27.585  0.00  O
ATOM    143  N    PHE A  21      11.616  26.137  26.274  0.00  N
ATOM    144  CA   PHE A  21      10.253  26.258  26.763  0.00  C
ATOM    145  CB   PHE A  21      10.196  27.160  27.992  0.00  C
ATOM    146  CG   PHE A  21      10.855  26.559  29.195  0.00  C
ATOM    147  CD1  PHE A  21      10.269  25.491  29.857  0.00  C
ATOM    148  CD2  PHE A  21      12.086  27.025  29.638  0.00  C
ATOM    149  CE1  PHE A  21      10.898  24.898  30.936  0.00  C
ATOM    150  CE2  PHE A  21      12.713  26.435  30.715  0.00  C
ATOM    151  CZ   PHE A  21  [illegible]  [illegible]  [illegible]  0.00  C
ATOM    152  C    PHE A  21       9.391  26.825  25.664  0.00  C
ATOM    153  O    PHE A  21       9.865  27.597  24.830  0.00  O
ATOM    154  N    ALA A  22       8.131  26.413  25.646  0.00  N
ATOM    155  CA   ALA A  22       7.194  26.882  24.647  0.00  C
ATOM    156  CB   ALA A  22       6.014  25.915  24.533  0.00  C
ATOM    157  C    ALA A  22       6.719  28.230  25.138  0.00  C
ATOM    158  O    ALA A  22       6.416  28.388  26.320  0.00  O
ATOM    159  N    VAL A  23       6.677  29.202  24.239  0.00  N
ATOM    160  CA   VAL A  23       6.233  30.546  24.582  0.00  C
ATOM    161  CB   VAL A  23       7.402  31.570  24.551  0.00  C
ATOM    162  CG1  VAL A  23       8.328  31.338  25.728  0.00  C
ATOM    163  CG2  VAL A  23       8.182  31.442  23.248  0.00  C
ATOM    164  C    VAL A  23       5.206  30.945  23.545  0.00  C
ATOM    165  O    VAL A  23       5.053  30.267  22.526  0.00  O
ATOM    166  N    ASN A  24       4.495  32.036  23.791  0.00  N
ATOM    167  CA   ASN A  24       3.492  32.476  22.832  0.00  C
ATOM    168  CB   ASN A  24       2.807  33.759  23.328  0.00  C
ATOM    169  C    ASN A  24       4.177  32.715  21.484  0.00  C
ATOM    170  O    ASN A  24       5.050  33.576  21.365  0.00  O
ATOM    171  CG   ASN A  24       3.737  34.963  23.334  0.00  C
ATOM    172  OD1  ASN A  24       4.697  35.029  24.107  0.00  O
ATOM    173  ND2  ASN A  24       3.451  35.927  22.462  0.00  N
ATOM    174  N    GLY A  25       3.801  31.929  20.477  0.00  N
ATOM    175  CA   GLY A  25       4.396  32.084  19.158  0.00  C
ATOM    176  C    GLY A  25       5.503  31.104  18.788  0.00  C
ATOM    177  O    GLY A  25       5.925  31.054  17.635  0.00  O
ATOM    178  N    GLY A  26       5.989  30.327  19.748  0.00  N
ATOM    179  CA   GLY A  26       7.043  29.377  19.433  0.00  C
ATOM    180  C    GLY A  26       7.702  28.795  20.666  0.00  C
ATOM    181  O    GLY A  26       7.028  28.328  21.582  0.00  O
ATOM    182  N    PHE A  27       9.028  28.813  20.688  0.00  N
ATOM    183  CA   PHE A  27       9.757  28.294  21.832  0.00  C
ATOM    184  CB   PHE A  27       9.973  26.783  21.710  0.00  C
ATOM    185  C    PHE A  27      11.103  28.975  21.954  0.00  C
ATOM    186  O    PHE A  27      11.660  29.459  20.963  0.00  O
ATOM    187  CG   PHE A  27      10.949  26.376  20.624  0.00  C
ATOM    188  CD1  PHE A  27      10.504  26.078  19.336  0.00  C
ATOM    189  CD2  PHE A  27      12.306  26.246  20.905  0.00  C
ATOM    190  CE1  PHE A  27      11.391  25.650  18.352  0.00  C
ATOM    191  CE2  PHE A  27      13.202  25.819  19.926  0.00  C
ATOM    192  CZ   PHE A  27      12.742  25.518  18.648  0.00  C
ATOM    193  N    ILE A  28      11.615  29.020  23.180  0.00  N
ATOM    194  CA   ILE A  28      12.904  29.640  23.445  0.00  C
ATOM    195  CB   ILE A  28      12.843  30.524  24.704  0.00  C
ATOM    196  C    ILE A  28      13.953  28.542  23.603  0.00  C
ATOM    197  O    ILE A  28      13.640  27.426  24.011  0.00  O
ATOM    198  CG2  ILE A  28      11.915  31.688  24.450  0.00  C
ATOM    199  CG1  ILE A  28      12.350  29.718  25.904  0.00  C
ATOM    200  CD1  ILE A  28      12.270  30.524  27.176  0.00  C
ATOM    201  N    THR A  29      15.195  28.866  23.265  0.00  N
ATOM    202  CA   THR A  29      16.293  27.916  23.353  0.00  C
ATOM    203  CB   THR A  29      16.329  27.054  22.052  0.00  C
ATOM    204  OG1  THR A  29      17.423  26.126  22.095  0.00  O
ATOM    205  CG2  THR A  29      16.459  27.950  20.831  0.00  C
ATOM    206  C    THR A  29      17.601  28.695  23.538  0.00  C
ATOM    207  O    THR A  29      17.565  29.881  23.842  0.00  O
ATOM    208  N    ALA A  30      18.743  28.029  23.362  0.00  N
ATOM    209  CA   ALA A  30      20.059  28.662  23.510  0.00  C
ATOM    210  CB   ALA A  30      21.121  27.601  23.765  0.00  C
ATOM    211  C    ALA A  30      20.447  29.486  22.282  0.00  C
ATOM    212  O    ALA A  30      20.232  29.061  21.141  0.00  O
ATOM    213  N    GLY A  31      21.028  30.659  22.520  0.00  N
ATOM    214  CA   GLY A  31      21.427  31.522  21.423  0.00  C
ATOM    215  C    GLY A  31      22.508  30.942  20.528  0.00  C
ATOM    216  O    GLY A  31      22.527  31.212  19.322  0.00  O
ATOM    217  N    HIS A  32      23.410  30.143  21.099  0.00  N
ATOM    218  CA   HIS A  32      24.490  29.558  20.310  0.00  C
ATOM    219  CB   HIS A  32      25.648  29.091  21.215  0.00  C
ATOM    220  CG   HIS A  32      25.412  27.772  21.885  0.00  C
ATOM    221  CD2  HIS A  32      24.715  27.451  23.001  0.00  C
ATOM    222  ND1  HIS A  32      25.946  26.589  21.419  0.00  N
ATOM    223  CE1  HIS A  32      25.590  25.601  22.218  0.00  C
ATOM    224  NE2  HIS A  32      24.842  26.098  23.188  0.00  N
ATOM    225  C    HIS A  32      24.029  28.401  19.413  0.00  C
ATOM    226  O    HIS A  32      24.805  27.870  18.630  0.00  O
ATOM    227  N    CYS A  33      22.762  28.025  19.525  0.00  N
ATOM    228  CA   CYS A  33      22.210  26.940  18.723  0.00  C
ATOM    229  CB   CYS A  33  [illegible]  26.522  19.251  0.00  C
ATOM    230  SG   CYS A  33      20.853  25.876  20.942  0.00  S
ATOM    231  C    CYS A  33      22.062  27.395  17.283  0.00  C
ATOM    232  O    CYS A  33      22.149  26.603  16.356  0.00  O
ATOM    233  N    GLY A  34      21.822  28.680  17.095  0.00  N
ATOM    234  CA   GLY A  34      21.664  29.181  15.749  0.00  C
ATOM    235  C    GLY A  34      21.360  30.656  15.763  0.00  C
ATOM    236  O    GLY A  34      20.984  31.213  16.794  0.00  O
ATOM    237  N    ARG A  35      21.523  31.288  14.608  0.00  N
ATOM    238  CA   ARG A  35      21.284  32.716  14.478  0.00  C
ATOM    239  CB   ARG A  35      22.417  33.355  13.680  0.00  C
ATOM    240  C    ARG A  35      19.951  33.012  13.798  0.00  C
ATOM    241  O    ARG A  35      19.348  32.138  13.173  0.00  O
ATOM    242  CG   ARG A  35      22.437  32.937  12.219  0.00  C
ATOM    243  CD   ARG A  35      23.488  33.715  11.458  0.00  C
ATOM    244  NE   ARG A  35      24.832  33.237  11.755  0.00  N1+
ATOM    245  CZ   ARG A  35      25.406  32.207  11.139  0.00  C
ATOM    246  NH1  ARG A  35      26.634  31.832  11.471  0.00  N
ATOM    247  NH2  ARG A  35      24.759  31.559  10.178  0.00  N
ATOM    248  N    THR A  36      19.513  34.258  13.918  0.00  N
ATOM    249  CA   THR A  36      18.259  34.714  13.335  0.00  C
ATOM    250  CB   THR A  36      18.124  36.242  13.522  0.00  C
ATOM    251  C    THR A  36      18.161  34.353  11.856  0.00  C
ATOM    252  O    THR A  36      19.123  34.512  11.099  0.00  O
ATOM    253  OG1  THR A  36      18.120  36.536  14.923  0.00  O
ATOM    254  CG2  THR A  36      16.844  36.773  12.880  0.00  C
ATOM    255  N    GLY A  37      16.999  33.855  11.449  0.00  N
ATOM    256  CA   GLY A  37      16.813  33.479  10.059  0.00  C
ATOM    257  C    GLY A  37      17.046  32.001   9.799  0.00  C
ATOM    258  O    GLY A  37      16.521  31.451   8.839  0.00  O
ATOM    259  N    ALA A  38      17.842  31.349  10.640  0.00  N
ATOM    260  CA   ALA A  38      18.095  29.924  10.470  0.00  C
ATOM    261  C    ALA A  38      16.745  29.222  10.565  0.00  C
ATOM    262  O    ALA A  38      15.881  29.657  11.324  0.00  O
ATOM    263  CB   ALA A  38      19.026  29.426  11.566  0.00  C
ATOM    264  N    THR A  39      16.553  28.151   9.800  0.00  N
ATOM    265  CA   THR A  39      15.281  27.432   9.842  0.00  C
ATOM    266  CB   THR A  39      14.779  27.066   8.425  0.00  C
ATOM    267  OG1  THR A  39      15.582  26.012   7.887  0.00  O
ATOM    268  CG2  THR A  39      14.857  28.277   7.504  0.00  C
ATOM    269  C    THR A  39      15.433  26.157  10.664  0.00  C
ATOM    270  O    THR A  39      16.533  25.637  10.821  0.00  O
ATOM    271  N    THR A  40      14.328  25.649  11.186  0.00  N
ATOM 


272 


CA 


THR A 


40 


14.382 


24.437 


11.990 


0.00 


c 




ATOM 


273 


CB 


THR A 


40 


14.143 


24.753 


13.473 


0.00 


c 




ATOM 


274 


OG1 


THR A 


40 


12.807 


25.242 


13.636 


0.00 


0 




ATOM 


275 


CG2 


THR A 


40 


15.124 


25.799 


13.962 


0.00 


c 


35 


ATOM 


276 


C 


THR A 


40 


13.332 


23.421 


11 .581 


0.00 


c 




ATOM 


277 


O 


THR A 


40 


12.345 


23.760 


10.927 


0.00 


0 




ATOM 


278 


N 


ALA A 


41 


13.546 


22.178 


11.994 


0.00 


N 




ATOM 


279 


CA 


ALA A 


41 


12.629 


21.084 


11.698 


0.00 


c 




ATOM 


280 


C 


ALA A 


41 


12 .368 


20.368 


13 .030 


0.00 


c 


40 


ATOM 


281 


O 


ALA A 


41 


13.211 


20.394 


13 .936 


0.00 


0 




ATOM 


282 


CB 


ALA A 


41 


13 .247 


20.133 


10.684 


0.00 


c 




ATOM    283  N    ASN A  42  11.206  19.734  13.149  0.00  N
ATOM    284  CA   ASN A  42  10.839  19.022  14.370  0.00  C
ATOM    285  C    ASN A  42  11.037  19.959  15.555  0.00  C
ATOM    286  O    ASN A  42  11.861  19.693  16.424  0.00  O
ATOM    287  CB   ASN A  42  11.720  17.780  14.584  0.00  C
ATOM    288  CG   ASN A  42  11.686  16.812  13.408  0.00  C
ATOM    289  OD1  ASN A  42  10.687  16.713  12.695  0.00  O
ATOM    290  ND2  ASN A  42  12.779  16.076  13.217  0.00  N
ATOM    291  N    PRO A  43  10.258  21.046  15.635  0.00  N
ATOM    292  CA   PRO A  43   9.206  21.493  14.718  0.00  C
ATOM    293  CB   PRO A  43   8.274  22.244  15.649  0.00  C
ATOM    294  C    PRO A  43   9.697  22.416  13.612  0.00  C
ATOM    295  O    PRO A  43  10.816  22.920  13.660  0.00  O
ATOM    296  CD   PRO A  43  10.319  21.934  16.809  0.00  C
ATOM    297  CG   PRO A  43   9.278  23.008  16.480  0.00  C
ATOM    298  N    THR A  44   8.841  22.652  12.621  0.00  N
ATOM    299  CA   THR A  44   9.208  23.533  11.522  0.00  C
ATOM    300  CB   THR A  44   8.225  23.421  10.345  0.00  C
ATOM    301  C    THR A  44   9.142  24.934  12.110  0.00  C
ATOM    302  O    THR A  44   8.162  25.293  12.772  0.00  O
ATOM    303  OG1  THR A  44   8.437  22.176   9.671  0.00  O
ATOM    304  CG2  THR A  44   8.423  24.566   9.366  0.00  C
ATOM    305  N    GLY A  45  10.196  25.710  11.893  0.00  N
ATOM    306  CA   GLY A  45  10.233  27.057  12.425  0.00  C
ATOM    307  C    GLY A  45  [illegible]  [illegible]  [illegible]  0.00  C
ATOM    308  O    GLY A  45  12.226  27.355  11.120  0.00  O
ATOM    309  N    THR A  46  11.537  29.084  12.401  0.00  N
ATOM    310  CA   THR A  46  12.615  29.979  11.998  0.00  C
ATOM    311  CB   THR A  46  12.134  30.919  10.867  0.00  C
ATOM    312  OG1  THR A  46  11.720  30.132   9.741  0.00  O
ATOM    313  CG2  THR A  46  13.246  31.872  10.438  0.00  C
ATOM    314  C    THR A  46  13.097  30.831  13.171  0.00  C
ATOM    315  O    THR A  46  12.287  31.407  13.909  0.00  O
ATOM    316  N    PHE A  47  14.412  30.903  13.358  0.00  N
ATOM    317  CA   PHE A  47  14.954  31.702  14.451  0.00  C
ATOM    318  CB   PHE A  47  16.478  31.585  14.530  0.00  C
ATOM    319  CG   PHE A  47  16.959  30.410  15.339  0.00  C
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
ATOM    320  CD2  PHE A  47  17.538  30.606  16.590  0.00  C
ATOM    321  CD1  PHE A  47  16.843  29.115  14.857  0.00  C
ATOM    322  CE2  PHE A  47  17.996  29.532  17.345  0.00  C
ATOM    323  CE1  PHE A  47  17.300  28.030  15.608  0.00  C
ATOM    324  CZ   PHE A  47  17.878  28.241  16.855  0.00  C
ATOM    325  C    PHE A  47  14.567  33.160  14.226  0.00  C
ATOM    326  O    PHE A  47  14.665  33.686  13.111  0.00  O
ATOM    327  N    ALA A  48  14.102  33.795  15.291  0.00  N
ATOM    328  CA   ALA A  48  13.690  35.184  15.245  0.00  C
ATOM    329  CB   ALA A  48  12.161  35.280  15.133  0.00  C
ATOM    330  C    ALA A  48  14.174  35.828  16.532  0.00  C
ATOM    331  O    ALA A  48  13.389  36.116  17.433  0.00  O
ATOM    332  N    GLY A  49  15.481  36.038  16.609  0.00  N
ATOM    333  CA   GLY A  49  16.072  36.635  17.791  0.00  C
ATOM    334  C    GLY A  49  17.068  35.674  18.415  0.00  C
ATOM    335  O    GLY A  49  16.698  34.589  18.867  0.00  O
ATOM    336  N    SER A  50  18.333  36.073  18.438  0.00  N
ATOM    337  CA   SER A  50  19.387  35.248  18.999  0.00  C
ATOM    338  CB   SER A  50  19.976  34.360  17.899  0.00  C
ATOM    339  OG   SER A  50  21.019  33.552  18.406  0.00  O
ATOM    340  C    SER A  50  20.484  36.112  19.633  0.00  C
ATOM    341  O    SER A  50  20.999  37.045  19.012  0.00  O
ATOM    342  N    SER A  51  20.832  35.794  20.877  0.00  N
ATOM    343  CA   SER A  51  21.860  36.529  21.603  0.00  C
ATOM    344  CB   SER A  51  21.228  37.337  22.741  0.00  C
ATOM    345  OG   SER A  51  22.179  38.189  23.359  0.00  O
ATOM    346  C    SER A  51  22.938  35.596  22.162  0.00  C
ATOM    347  O    SER A  51  22.700  34.819  23.089  0.00  O
ATOM    348  N    PHE A  52  24.127  35.692  21.579  0.00  N
ATOM    349  CA   PHE A  52  25.277  34.889  21.970  0.00  C
ATOM    350  CB   PHE A  52  25.031  33.414  21.643  0.00  C
ATOM    351  CG   PHE A  52  26.204  32.518  21.941  0.00  C
ATOM    352  CD1  PHE A  52  26.485  32.124  23.238  0.00  C
ATOM    353  CD2  PHE A  52  27.034  32.081  20.922  0.00  C
ATOM    354  CE1  PHE A  52  27.575  31.312  23.516  0.00  C
ATOM    355  CE2  PHE A  52  28.131  31.266  21.193  0.00  C
ATOM    356  CZ   PHE A  52  28.400  30.883  22.492  0.00  C
ATOM    357  C    PHE A  52  26.468  35.390  21.167  0.00  C
ATOM    358  O    PHE A  52  26.370  35.589  19.960  0.00  O
ATOM    359  N    PRO A  53  27.612  35.603  21.827  0.00  N
ATOM    360  CD   PRO A  53  28.893  35.756  21.110  0.00  C
ATOM    361  CA   PRO A  53  27.831  35.405  23.266  0.00  C
ATOM    362  CB   PRO A  53  29.351  35.249  23.361  0.00  C
ATOM    363  CG   PRO A  53  29.851  36.088  22.223  0.00  C
ATOM    364  C    PRO A  53  27.268  36.543  24.132  0.00  C
ATOM    365  O    PRO A  53  26.346  37.235  23.713  0.00  O
ATOM    366  N    GLY A  54  27.814  36.744  25.328  0.00  N
ATOM    367  CA   GLY A  54  27.288  37.777  26.211  0.00  C
ATOM    368  C    GLY A  54  26.143  37.138  26.980  0.00  C
ATOM    369  O    GLY A  54  26.210  36.964  28.197  0.00  O
ATOM    370  N    ASN A  55  25.079  36.806  26.254  0.00  N
ATOM    371  CA   ASN A  55  23.922  36.103  26.810  0.00  C
ATOM    372  CB   ASN A  55  22.579  36.740  26.404  0.00  C
ATOM    373  CG   ASN A  55  22.516  38.240  26.641  0.00  C
ATOM    374  OD1  ASN A  55  22.161  39.005  25.734  0.00  O
ATOM    375  ND2  ASN A  55  22.833  38.667  27.857  0.00  N
ATOM    376  C    ASN A  55  24.011  34.788  26.037  0.00  C
ATOM    377  O    ASN A  55  24.998  34.538  25.333  0.00  O
ATOM    378  N    ASP A  56  22.980  33.958  26.171  0.00  N
ATOM    379  CA   ASP A  56  22.917  32.682  25.473  0.00  C
ATOM    380  CB   ASP A  56  23.774  31.595  26.119  0.00  C
ATOM    381  CG   ASP A  56  23.987  30.395  25.179  0.00  C
ATOM    382  OD1  ASP A  56  24.631  29.408  25.585  0.00  O
ATOM    383  OD2  ASP A  56  23.504  30.443  24.024  0.00  O1-
ATOM    384  C    ASP A  56  21.470  32.221  25.379  0.00  C
ATOM    385  O    ASP A  56  21.078  31.195  25.930  0.00  O
ATOM    386  N    TYR A  57  20.672  33.008  24.671  0.00  N
ATOM    387  CA   TYR A  57  19.266  32.693  24.485  0.00  C
ATOM    388  CB   TYR A  57  18.396  33.484  25.463  0.00  C
ATOM    389  CG   TYR A  57  18.527  34.993  25.374  0.00  C
ATOM    390  CD1  TYR A  57  19.153  35.711  26.390  0.00  C
ATOM    391  CE1  TYR A  57  19.231  37.092  26.352  0.00  C
ATOM    392  CD2  TYR A  57  17.986  35.706  24.303  0.00  C
ATOM    393  CE2  TYR A  57  18.060  37.093  24.255  0.00  C
ATOM    394  CZ   TYR A  57  18.682  37.781  25.289  0.00  C
ATOM    395  OH   TYR A  57  18.732  39.165  25.286  0.00  O
ATOM    396  C    TYR A  57  18.820  32.998  23.062  0.00  C
ATOM    397  O    TYR A  57  19.438  33.800  22.355  0.00  O
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
ATOM    398  N    ALA A  58  17.742  32.344  22.652  0.00  N
ATOM    399  CA   ALA A  58  17.187  32.532  21.323  0.00  C
ATOM    400  CB   ALA A  58  17.899  31.645  20.312  0.00  C
ATOM    401  C    ALA A  58  15.706  32.191  21.360  0.00  C
ATOM    402  O    ALA A  58  15.228  31.521  22.284  0.00  O
ATOM    403  N    PHE A  59  14.989  32.683  20.359  0.00  N
ATOM    404  CA   PHE A  59  13.564  32.453  20.225  0.00  C
ATOM    405  CB   PHE A  59  12.762  33.735  20.438  0.00  C
ATOM    406  CG   PHE A  59  11.333  33.629  19.970  0.00  C
ATOM    407  CD2  PHE A  59  10.859  34.437  18.947  0.00  C
ATOM    408  CD1  PHE A  59  10.475  32.698  20.531  0.00  C
ATOM    409  CE2  PHE A  59   9.553  34.316  18.491  0.00  C
ATOM    410  CE1  PHE A  59   9.175  32.573  20.084  0.00  C
ATOM    411  CZ   PHE A  59   8.712  33.382  19.063  0.00  C
ATOM    412  C    PHE A  59  13.294  31.942  18.816  0.00  C
ATOM    413  O    PHE A  59  13.693  32.562  17.820  0.00  O
ATOM    414  N    VAL A  60  12.616  30.809  18.731  0.00  N
ATOM    415  CA   VAL A  60  12.308  30.253  17.434  0.00  C
ATOM    416  CB   VAL A  60  12.702  28.776  17.340  0.00  C
ATOM    417  CG1  VAL A  60  12.503  28.279  15.908  0.00  C
ATOM    418  CG2  VAL A  60  14.147  28.593  17.796  0.00  C
ATOM    419  C    VAL A  60  10.816  30.361  17.236  0.00  C
ATOM    420  O    VAL A  60  10.043  29.927  18.087  0.00  O
ATOM    421  N    ARG A  61  10.406  30.960  16.126  0.00  N
ATOM    422  CA   ARG A  61   8.987  31.098  15.851  0.00  C
ATOM    423  CB   ARG A  61   8.704  32.313  14.962  0.00  C
ATOM    424  CG   ARG A  61   7.255  32.374  14.480  0.00  C
ATOM    425  CD   ARG A  61   7.019  33.543  13.521  0.00  C
ATOM    426  NE   ARG A  61   5.615  33.660  13.118  0.00  N1+
ATOM    427  CZ   ARG A  61   4.989  32.815  12.303  0.00  C
ATOM    428  NH2  ARG A  61   3.711  33.007  12.004  0.00  N
ATOM    429  NH1  ARG A  61   5.636  31.777  11.787  0.00  N
ATOM    430  C    ARG A  61   8.509  29.847  15.128  0.00  C
ATOM    431  O    ARG A  61   9.193  29.338  14.238  0.00  O
ATOM    432  N    THR A  62   7.338  29.357  15.527  0.00  N
ATOM    433  CA   THR A  62   6.740  28.170  14.923  0.00  C
ATOM    434  CB   THR A  62   6.514  27.046  15.956  0.00  C
ATOM    435  OG1  THR A  62   5.808  27.570  17.089  0.00  O
ATOM    436  CG2  THR A  62   7.845  26.460  16.396  0.00  C
ATOM    437  C    THR A  62   5.391  28.597  14.352  0.00  C
ATOM    438  O    THR A  62   4.857  29.645  14.724  0.00  O
ATOM    439  N    GLY A  63   4.837  27.791  13.455  0.00  N
ATOM    440  CA   GLY A  63   3.562  28.146  12.859  0.00  C
ATOM    441  C    GLY A  63   2.522  27.046  12.880  0.00  C
ATOM    442  O    GLY A  63   2.375  26.326  13.873  0.00  O
ATOM    443  N    ALA A  64   1.806  26.909  11.767  0.00  N
ATOM    444  CA   ALA A  64   0.744  25.916  11.643  0.00  C
ATOM    445  C    ALA A  64   1.213  24.496  11.895  0.00  C
ATOM    446  O    ALA A  64   2.370  24.154  11.651  0.00  O
ATOM    447  CB   ALA A  64   0.111  26.009  10.268  0.00  C
ATOM    448  N    GLY A  65   0.291  23.672  12.381  0.00  N
ATOM    449  CA   GLY A  65   0.596  22.281  12.657  0.00  C
ATOM    450  C    GLY A  65   1.469  22.050  13.877  0.00  C
ATOM    451  O    GLY A  65   1.797  20.908  14.199  0.00  O
ATOM    452  N    VAL A  66   1.837  23.119  14.572  0.00  N
ATOM    453  CA   VAL A  66   2.699  22.976  15.736  0.00  C
ATOM    454  CB   VAL A  66   3.946  23.854  15.595  0.00  C
ATOM    455  C    VAL A  66   2.031  23.307  17.063  0.00  C
ATOM    456  O    VAL A  66   1.737  24.467  17.337  0.00  O
ATOM    457  CG1  VAL A  66   4.832  23.683  16.818  0.00  C
ATOM    458  CG2  VAL A  66   4.698  23.482  14.324  0.00  C
ATOM    459  N    ASN A  67   1.806  22.283  17.882  0.00  N
ATOM    460  CA   ASN A  67   1.176  22.454  19.185  0.00  C
ATOM    461  CB   ASN A  67   0.403  21.188  19.564  0.00  C
ATOM    462  C    ASN A  67   2.240  22.745  20.237  0.00  C
ATOM    463  O    ASN A  67   3.120  21.920  20.491  0.00  O
ATOM    464  CG   ASN A  67  -0.405  20.634  18.404  0.00  C
ATOM    465  OD1  ASN A  67  -1.160  21.361  17.750  0.00  O
ATOM    466  ND2  ASN A  67  -0.253  19.340  18.140  0.00  N
ATOM    467  N    LEU A  68   2.148  23.923  20.845  0.00  N
ATOM    468  CA   LEU A  68   3.087  24.366  21.876  0.00  C
ATOM    469  CB   LEU A  68   3.279  25.883  21.759  0.00  C
ATOM    470  C    LEU A  68   2.571  23.996  23.273  0.00  C
ATOM    471  O    LEU A  68   1.620  24.597  23.770  0.00  O
ATOM    472  CG   LEU A  68   3.688  26.430  20.380  0.00  C
ATOM    473  CD1  LEU A  68   3.724  27.950  20.406  0.00  C
ATOM    474  CD2  LEU A  68   5.051  25.888  19.987  0.00  C
ATOM    475  N    LEU A  69   3.218  23.027  23.917  0.00  N
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
ATOM    476  CA   LEU A  69   2.803  22.584  25.250  0.00  C
ATOM    477  CB   LEU A  69   2.769  21.052  25.286  0.00  C
ATOM    478  CG   LEU A  69   2.045  20.369  24.116  0.00  C
ATOM    479  CD1  LEU A  69   2.109  18.857  24.274  0.00  C
ATOM    480  CD2  LEU A  69   0.604  20.841  24.050  0.00  C
ATOM    481  C    LEU A  69   3.612  23.078  26.449  0.00  C
ATOM    482  O    LEU A  69   4.835  23.216  26.394  0.00  O
ATOM    483  N    ALA A  70   2.907  23.332  27.544  0.00  N
ATOM    484  CA   ALA A  70   3.534  23.796  28.773  0.00  C
ATOM    485  CB   ALA A  70   2.507  24.496  29.646  0.00  C
ATOM    486  C    ALA A  70   4.048  22.539  29.473  0.00  C
ATOM    487  O    ALA A  70   3.712  22.273  30.618  0.00  O
ATOM    488  N    GLN A  71   4.862  21.763  28.770  0.00  N
ATOM    489  CA   GLN A  71   5.408  20.536  29.325  0.00  C
ATOM    490  CB   GLN A  71   4.618  19.333  28.808  0.00  C
ATOM    491  CG   GLN A  71   3.169  19.299  29.255  0.00  C
ATOM    492  CD   GLN A  71   2.407  18.116  28.692  0.00  C
ATOM    493  OE1  GLN A  71   1.460  17.634  29.308  0.00  O
ATOM    494  NE2  GLN A  71   2.809  17.646  27.515  0.00  N
ATOM    495  C    GLN A  71   6.869  20.310  28.998  0.00  C
ATOM    496  O    GLN A  71   7.395  20.825  28.009  0.00  O
ATOM    497  N    VAL A  72   7.520  19.529  29.850  0.00  N
ATOM    498  CA   VAL A  72   8.924  19.199  29.676  0.00  C
ATOM    499  CB   VAL A  72   9.809  19.799  30.777  0.00  C
ATOM    500  CG1  VAL A  72  11.240  19.342  30.580  0.00  C
ATOM    501  CG2  VAL A  72   9.726  21.309  30.758  0.00  C
ATOM    502  C    VAL A  72   8.997  17.685  29.772  0.00  C
ATOM    503  O    VAL A  72   8.419  17.086  30.680  0.00  O
ATOM    504  N    ASN A  73   9.699  17.075  28.824  0.00  N
ATOM    505  CA   ASN A  73   9.867  15.629  28.771  0.00  C
ATOM    506  CB   ASN A  73  10.543  15.250  27.452  0.00  C
ATOM    507  CG   ASN A  73  10.513  13.756  27.182  0.00  C
ATOM    508  OD1  ASN A  73  10.470  12.947  28.106  0.00  O
ATOM    509  ND2  ASN A  73  10.551  13.387  25.906  0.00  N
ATOM    510  C    ASN A  73  10.735  15.146  29.931  0.00  C
ATOM    511  O    ASN A  73  11.843  15.651  30.123  0.00  O
ATOM    512  N    ASN A  74  10.244  14.175  30.703  0.00  N
ATOM    513  CA   ASN A  74  11.028  13.663  31.823  0.00  C
ATOM    514  CB   ASN A  74  10.151  13.368  33.049  0.00  C
ATOM    515  CG   ASN A  74   9.191  12.217  32.830  0.00  C
ATOM    516  OD1  ASN A  74   9.486  11.265  32.108  0.00  O
ATOM    517  ND2  ASN A  74   8.032  12.291  33.477  0.00  N
ATOM    518  C    ASN A  74  11.791  12.408  31.417  0.00  C
ATOM    519  O    ASN A  74  12.332  11.695  32.266  0.00  O
ATOM    520  N    TYR A  75  11.830  12.156  30.112  0.00  N
ATOM    521  CA   TYR A  75  12.514  11.005  29.528  0.00  C
ATOM    522  CB   TYR A  75  14.008  11.321  29.354  0.00  C
ATOM    523  CG   TYR A  75  14.268  12.239  28.181  0.00  C
ATOM    524  CD1  TYR A  75  14.228  11.756  26.873  0.00  C
ATOM    525  CE1  TYR A  75  14.371  12.597  25.792  0.00  C
ATOM    526  CD2  TYR A  75  14.466  13.599  28.370  0.00  C
ATOM    527  CE2  TYR A  75  14.608  14.451  27.290  0.00  C
ATOM    528  CZ   TYR A  75  14.557  13.945  26.005  0.00  C
ATOM    529  OH   TYR A  75  14.679  14.796  24.931  0.00  O
ATOM    530  C    TYR A  75  12.326   9.680  30.260  0.00  C
ATOM    531  O    TYR A  75  13.253   8.875  30.378  0.00  O
ATOM    532  N    SER A  76  11.112   9.464  30.747  0.00  N
ATOM    533  CA   SER A  76  10.773  [illegible]  [illegible]
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34.238 


0.00 


c 
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
ATOM    866  O    GLY A 124  22.910  34.603  33.936  0.00  O
ATOM    867  N    LEU A 125  24.891  33.666  34.415  0.00  N
ATOM    868  CA   LEU A 125  24.412  32.304  34.238  0.00  C
ATOM    869  CB   LEU A 125  25.411  31.495  33.406  0.00  C
ATOM    870  CG   LEU A 125  25.597  31.779  31.913  0.00  C
ATOM    871  CD1  LEU A 125  26.780  30.977  31.383  0.00  C
ATOM    872  CD2  LEU A 125  24.333  31.411  31.152  0.00  C
ATOM    873  C    LEU A 125  24.189  31.574  35.554  0.00  C
ATOM    874  O    LEU A 125  24.828  31.869  36.573  0.00  O
ATOM    875  N    ILE A 126  23.270  30.615  35.516  0.00  N
ATOM    876  CA   ILE A 126  22.949  29.813  36.685  0.00  C
ATOM    877  CB   ILE A 126  21.506  29.276  36.600  0.00  C
ATOM    878  CG2  ILE A 126  21.268  28.230  37.672  0.00  C
ATOM    879  CG1  ILE A 126  20.517  30.441  36.754  0.00  C
ATOM    880  CD1  ILE A 126  19.074  30.045  36.578  0.00  C
ATOM    881  C    ILE A 126  23.947  28.646  36.668  0.00  C
ATOM    882  O    ILE A 126  24.009  27.881  35.701  0.00  O
ATOM    883  N    ARG A 127  24.746  28.536  37.723  0.00  N
ATOM    884  CA   ARG A 127  25.738  27.473  37.831  0.00  C
ATOM    885  CB   ARG A 127  26.989  28.007  38.528  0.00  C
ATOM    886  CG   ARG A 127  28.129  27.015  38.679  0.00  C
ATOM    887  CD   ARG A 127  29.261  27.678  39.441  0.00  C
ATOM    888  NE   ARG A 127  30.312  26.748  39.830  0.00  N1+
ATOM 
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CZ 


ARG 


A 


127 


31.098 


26.112 


38.971 
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A 


127 


32.033 


25.279 


39.417 


0.00 


N 
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ARG 


A 


127 


30.949 


26.310 


37.669 


0.00 
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ATOM 


892 


C 
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A 
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26.328 


38.633 


0.00 
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127 
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26.553 
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0 
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N 
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25.325 


25.103 


38.151 


0.00 


N 
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ATOM 
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CA 


THR 


A 
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24.784 


23.929 


38.828 


0.00 


c 




ATOM    896  CB   THR A 128  23.447  23.475  38.189  0.00  C
ATOM    897  OG1  THR A 128  23.718  22.755  36.977  0.00  O
ATOM    898  CG2  THR A 128  22.568  24.674  37.863  0.00  C
ATOM    899  C    THR A 128  25.720  22.729  38.782  0.00  C
ATOM    900  O    THR A 128  26.763  22.759  38.135  0.00  O
ATOM    901  N    THR A 129  25.317  21.667  39.472  0.00  N
ATOM    902  CA   THR A 129  26.084  20.429  39.533  0.00  C
ATOM    903  CB   THR A 129  26.055  19.838  40.946  0.00  C
ATOM    904  OG1  THR A 129  24.691  19.639  41.355  0.00  O
ATOM    905  CG2  THR A 129  26.758  20.779  41.924  0.00  C
ATOM    906  C    THR A 129  25.474  19.411  38.565  0.00  C
ATOM    907  O    THR A 129  25.792  18.227  38.607  0.00  O
ATOM    908  N    VAL A 130  24.589  19.886  37.696  0.00  N
ATOM    909  CA   VAL A 130  23.930  19.027  36.722  0.00  C
ATOM    910  CB   VAL A 130  22.663  19.707  36.164  0.00  C
ATOM    911  CG1  VAL A 130  21.972  18.790  35.162  0.00  C
ATOM    912  CG2  VAL A 130  21.715  20.054  37.308  0.00  C
ATOM    913  C    VAL A 130  24.857  18.691  35.561  0.00  C
ATOM    914  O    VAL A 130  25.623  19.536  35.109  0.00  O
ATOM    915  N    CYS A 131  24.790  17.449  35.086  0.00  N
ATOM    916  CA   CYS A 131  25.626  17.016  33.975  0.00  C
ATOM    917  CB   CYS A 131  25.889  15.507  34.034  0.00  C
ATOM    918  SG   CYS A 131  24.399  14.468  33.874  0.00  S
ATOM    919  C    CYS A 131  24.893  17.340  32.690  0.00  C
ATOM    920  O    CYS A 131  23.670  17.436  32.678  0.00  O
ATOM    921  N    ALA A 132  25.636  17.514  31.607  0.00  N
ATOM    922  CA   ALA A 132  25.020  17.821  30.329  0.00  C
ATOM    923  CB   ALA A 132  24.707  19.313  30.237  0.00  C
ATOM    924  C    ALA A 132  25.920  17.404  29.176  0.00  C
ATOM    925  O    ALA A 132  27.113  17.139  29.356  0.00  O
ATOM    926  N    GLU A 133  25.323  17.353  27.992  0.00  N
ATOM    927  CA   GLU A 133  26.017  16.981  26.774  0.00  C
ATOM    928  CB   GLU A 133  25.434  15.686  26.219  0.00  C
ATOM    929  CG   GLU A 133  26.457  14.695  25.730  0.00  C
ATOM    930  CD   GLU A 133  27.077  13.909  26.862  0.00  C
ATOM    931  OE1  GLU A 133  27.702  14.533  27.741  0.00  O1-
ATOM 


932 


OE2 


GLU 
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133 


26.937 


12.667 


26.871 


0.00 


0 




ATOM 


933 
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GLU 
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133 


25.750 


18.114 


25.792 


0.00 
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934 
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A 


133 


24.778 


18.851 


25.946 


0.00 


0 
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
ATOM    935  N    PRO A 134  26.602  18.268  24.769  0.00  N
ATOM    936  CA   PRO A 134  26.395  19.343  23.789  0.00  C
ATOM    937  CB   PRO A 134  27.471  19.059  22.742  0.00  C
ATOM    938  C    PRO A 134  24.975  19.390  23.185  0.00  C
ATOM    939  O    PRO A 134  24.331  20.446  23.159  0.00  O
ATOM    940  CD   PRO A 134  27.856  17.539  24.501  0.00  C
ATOM    941  CG   PRO A 134  28.586  18.478  23.572  0.00  C
ATOM    942  N    GLY A 135  24.490  18.250  22.708  0.00  N
ATOM    943  CA   GLY A 135  23.167  18.218  22.117  0.00  C
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
ATOM    944  C    GLY A 135  22.074  18.675  23.063  0.00  C
ATOM    945  O    GLY A 135   20.979   19.053   22.631  0.00  O
ATOM    946  N    ASP A 136   22.369   18.638   24.359  0.00  N
ATOM    947  CA   ASP A 136   21.414   19.046   25.387  0.00  C
ATOM    948  CB   ASP A 136   21.914   18.588   26.770  0.00  C
ATOM    949  C    ASP A 136   21.162   20.564   25.400  0.00  C
ATOM    950  O    ASP A 136   20.124   21.024   25.886  0.00  O
ATOM    951  CG   ASP A 136   21.783   17.075   26.982  0.00  C
ATOM    952  OD2  ASP A 136   20.834   16.471   26.436  0.00  O1-
ATOM    953  OD1  ASP A 136   22.618   16.492   27.714  0.00  O
ATOM    954  N    SER A 137   22.109   21.332   24.868  0.00  N
ATOM    955  CA   SER A 137   21.989   22.791   24.823  0.00  C
ATOM    956  CB   SER A 137   23.048   23.390   23.896  0.00  C
ATOM    957  C    SER A 137   20.610   23.287   24.388  0.00  C
ATOM    958  O    SER A 137   19.993   22.752   23.456  0.00  O
ATOM    959  OG   SER A 137   24.352   23.234   24.427  0.00  O
ATOM    960  N    GLY A 138   20.148   24.332   25.070  0.00  N
ATOM    961  CA   GLY A 138   18.854   24.904   24.782  0.00  C
ATOM    962  C    GLY A 138   17.803   24.224   25.629  0.00  C
ATOM    963  O    GLY A 138   16.706   24.748   25.809  0.00  O
ATOM    964  N    GLY A 139   18.150   23.057   26.160  0.00  N
ATOM    965  CA   GLY A 139   17.222   22.297   26.982  0.00  C
ATOM    966  C    GLY A 139   16.617   23.021   28.176  0.00  C
ATOM    967  O    GLY A 139   17.104   24.070   28.604  0.00  O


ATOM    968  N    SER A 140   15.555   22.438   28.729  0.00  N
ATOM    969  CA   SER A 140   14.858   23.024   29.870  0.00  C
ATOM    970  CB   SER A 140   13.423   22.500   29.948  0.00  C
ATOM    971  OG   SER A 140   12.971   22.037   28.691  0.00  O
ATOM    972  C    SER A 140   15.532   22.736   31.198  0.00  C
ATOM    973  O    SER A 140   16.162   21.691   31.389  0.00  O
ATOM    974  N    LEU A 141   15.393   23.683   32.115  0.00  N
ATOM    975  CA   LEU A 141   15.967   23.558   33.448  0.00  C
ATOM    976  CB   LEU A 141   17.175   24.482   33.639  0.00  C
ATOM    977  CG   LEU A 141   17.722   24.420   35.073  0.00  C
ATOM    978  CD1  LEU A 141   18.323   23.047   35.334  0.00  C
ATOM    979  CD2  LEU A 141   18.749   25.518   35.297  0.00  C
ATOM    980  C    LEU A 141   14.851   23.945   34.405  0.00  C
ATOM    981  O    LEU A 141   14.398   25.081   34.422  0.00  O
ATOM    982  N    LEU A 142   14.409   22.987   35.199  0.00  N
ATOM    983  CA   LEU A 142   13.341   23.220   36.150  0.00  C
ATOM    984  CB   LEU A 142   12.230   22.198   35.913  0.00  C
ATOM    985  CG   LEU A 142   11.289   22.306   34.719  0.00  C
ATOM    986  CD1  LEU A 142   10.674   20.933   34.463  0.00  C
ATOM    987  CD2  LEU A 142   10.219   23.350   34.996  0.00  C
ATOM    988  C    LEU A 142   13.702   23.168   37.629  0.00  C
ATOM    989  O    LEU A 142   14.745   22.671   38.029  0.00  O




ATOM    990  N    ALA A 143   12.788   23.701   38.424  0.00  N
ATOM    991  CA   ALA A 143   12.880   23.759   39.875  0.00  C
ATOM    992  CB   ALA A 143   13.159   25.178   40.345  0.00  C
ATOM    993  C    ALA A 143   11.434   23.368   40.165  0.00  C
ATOM    994  O    ALA A 143   10.557   24.221   40.225  0.00  O
ATOM    995  N    GLY A 144   11.175   22.072   40.287  0.00  N
ATOM    996  CA   GLY A 144    9.810   21.642   40.513  0.00  C
ATOM    997  C    GLY A 144    9.058   21.945   39.232  0.00  C
ATOM    998  O    GLY A 144    9.457   21.487   38.154  0.00  O
ATOM    999  N    ASN A 145    7.984   22.723   39.322  0.00  N
ATOM   1000  CA   ASN A 145    7.241   23.066   38.122  0.00  C
ATOM   1001  CB   ASN A 145    5.736   22.848   38.321  0.00  C
ATOM   1002  CG   ASN A 145    5.144   23.751   39.384  0.00  C
ATOM   1003  OD1  ASN A 145    5.382   24.962   39.396  0.00  O
ATOM   1004  ND2  ASN A 145    4.351   23.166   40.281  0.00  N
ATOM   1005  C    ASN A 145    7.503   24.496   37.650  0.00  C
ATOM   1006  O    ASN A 145    6.716   25.049   36.886  0.00  O
ATOM   1007  N    GLN A 146    8.613   25.086   38.093  0.00  N
ATOM   1008  CA   GLN A 146    8.968   26.455   37.702  0.00  C
ATOM   1009  CB   GLN A 146      ???      ???      ???  0.00  C
ATOM   1010  CG   GLN A 146    8.080   27.367   39.916  0.00  C
ATOM   1011  CD   GLN A 146    6.875   28.097   39.363  0.00  C
ATOM   1012  OE1  GLN A 146    5.735   27.705   39.615  0.00  O


ATOM   1013  NE2  GLN A 146    7.117   29.172   38.617  0.00  N
ATOM   1014  C    GLN A 146   10.205   26.492   36.798  0.00  C
ATOM   1015  O    GLN A 146   11.277   25.999   37.169  0.00  O
ATOM   1016  N    ALA A 147   10.055   27.084   35.618  0.00  N
ATOM   1017  CA   ALA A 147   11.160   27.188   34.660  0.00  C
ATOM   1018  CB   ALA A 147   10.642   27.698   33.309  0.00  C
ATOM   1019  C    ALA A 147   12.253   28.124   35.183  0.00  C
ATOM   1020  O    ALA A 147   11.958   29.233   35.625  0.00  O
ATOM   1021  N    GLN A 148   13.508   27.679   35.124  0.00  N

ATOM   1197  CG   LEU A 173   10.263   34.645   29.486  0.00  C
ATOM   1198  CD1  LEU A 173    9.465   34.308   30.759  0.00  C
ATOM   1199  CD2  LEU A 173    9.671   33.965   28.251  0.00  C
ATOM   1200  C    LEU A 173   11.031   38.105   27.869  0.00  C
ATOM   1201  O    LEU A 173   10.287   38.617   27.037  0.00  O
ATOM   1202  N    GLN A 174   11.831   38.807   28.662  0.00  N
ATOM   1203  CA   GLN A 174   11.896   40.259   28.641  0.00  C
ATOM   1204  CB   GLN A 174   12.665   40.752   29.870  0.00  C
ATOM   1205  CG   GLN A 174   12.868   42.259   29.923  0.00  C
ATOM   1206  CD   GLN A 174   11.664   43.006   30.461  0.00  C


ATOM   1207  OE1  GLN A 174   10.532   42.811   30.003  0.00  O
ATOM   1208  NE2  GLN A 174   11.904   43.876   31.438  0.00  N
ATOM   1209  C    GLN A 174   12.555   40.817   27.381  0.00  C
ATOM   1210  O    GLN A 174   12.219   41.909   26.933  0.00  O
ATOM   1211  N    ALA A 175   13.493   40.078   26.808  0.00  N
ATOM   1212  CA   ALA A 175   14.164   40.552   25.604  0.00  C
ATOM   1213  CB   ALA A 175   15.378   39.681   25.306  0.00  C
ATOM   1214  C    ALA A 175   13.238   40.580   24.394  0.00  C
ATOM   1215  O    ALA A 175   13.276   41.512   23.595  0.00  O
ATOM   1216  N    TYR A 176   12.396   39.561   24.276  0.00  N
ATOM   1217  CA   TYR A 176   11.462   39.458   23.161  0.00  C
ATOM   1218  CB   TYR A 176   11.571   38.063   22.535  0.00  C
ATOM   1219  CG   TYR A 176   12.990   37.700   22.173  0.00  C
ATOM   1220  CD1  TYR A 176   13.761   38.551   21.381  0.00  C
ATOM   1221  CE1  TYR A 176   15.075   38.249   21.073  0.00  C
ATOM   1222  CD2  TYR A 176   13.574   36.528   22.643  0.00  C
ATOM   1223  CE2  TYR A 176   14.890   36.213   22.335  0.00  C
ATOM   1224  CZ   TYR A 176   15.636   37.083   21.553  0.00  C
ATOM   1225  OH   TYR A 176   16.959   36.817   21.285  0.00  O
ATOM   1226  C    TYR A 176   10.004   39.742   23.500  0.00  C
ATOM   1227  O    TYR A 176    9.135   39.574   22.646  0.00  O
ATOM   1228  N    GLY A 177    9.736   40.165   24.733  0.00  N
ATOM   1229  CA   GLY A 177    8.366   40.457   25.131  0.00  C
ATOM   1230  C    GLY A 177    7.469   39.232   25.065  0.00  C
ATOM   1231  O    GLY A 177    6.295   39.326   24.711  0.00  O


ATOM   1232  N    LEU A 178    8.033   38.080   25.421  0.00  N
ATOM   1233  CA   LEU A 178    7.323   36.807   25.390  0.00  C
ATOM   1234  CB   LEU A 178    8.275   35.694   24.937  0.00  C
ATOM   1235  CG   LEU A 178    8.981   35.724   23.581  0.00  C
ATOM   1236  CD1  LEU A 178   10.077   34.688   23.584  0.00  C
ATOM   1237  CD2  LEU A 178    8.006   35.441   22.454  0.00  C
ATOM   1238  C    LEU A 178    6.737   36.403   26.741  0.00  C
ATOM   1239  O    LEU A 178    7.221   36.821   27.794  0.00  O
ATOM   1240  N    ARG A 179    5.698   35.573   26.688  0.00  N
ATOM   1241  CA   ARG A 179    5.008   35.060   27.875  0.00  C
ATOM   1242  CB   ARG A 179    3.519   35.439   27.872  0.00  C
ATOM   1243  CG   ARG A 179    3.193   36.849   28.356  0.00  C
ATOM   1244  CD   ARG A 179    1.760   37.239   27.989  0.00  C
ATOM   1245  NE   ARG A 179    1.401   38.565   28.490  0.00  N1+
ATOM   1246  CZ   ARG A 179    1.070   38.825   29.751  0.00  C
ATOM   1247  NH1  ARG A 179    1.044   37.844   30.646  0.00  N
ATOM   1248  NH2  ARG A 179    0.773   40.066   30.117  0.00  N
ATOM   1249  C    ARG A 179    5.118   33.541   27.794  0.00  C
ATOM   1250  O    ARG A 179    5.043   32.978   26.707  0.00  O
ATOM   1251  N    MET A 180    5.313   32.882   28.931  0.00  N
ATOM   1252  CA   MET A 180    5.422   31.428   28.955  0.00  C
ATOM   1253  CB   MET A 180    5.866   30.936   30.329  0.00  C
ATOM   1254  CG   MET A 180    7.257   31.311   30.768  0.00  C
ATOM   1255  SD   MET A 180    8.400   30.052   30.227  0.00  S
ATOM   1256  CE   MET A 180    7.622   28.610   30.892  0.00  C
ATOM   1257  C    MET A 180    4.034   30.856   28.712  0.00  C
ATOM   1258  O    MET A 180    3.034   31.444   29.118  0.00  O




ATOM   1259  N    ILE A 181    3.967   29.715   28.042  0.00  N
ATOM   1260  CA   ILE A 181    2.690   29.085   27.781  0.00  C
ATOM   1261  CB   ILE A 181    2.726   28.266   26.464  0.00  C
ATOM   1262  CG2  ILE A 181    1.534   27.294   26.399  0.00  C
ATOM   1263  CG1  ILE A 181    2.711   29.237   25.270  0.00  C
ATOM   1264  CD1  ILE A 181    2.740   28.556   23.910  0.00  C
ATOM   1265  C    ILE A 181    2.613   28.211   29.023  0.00  C
ATOM   1266  O    ILE A 181    3.458   27.357   29.238  0.00  O
ATOM   1267  N    THR A 182    1.598   28.457   29.845  0.00  N
ATOM   1268  CA   THR A 182    1.398   27.724   31.090  0.00  C
ATOM   1269  C    THR A 182    0.212   26.770   31.163  0.00  C
ATOM   1270  O    THR A 182   -0.098   26.231   32.241  0.00  O
ATOM   1271  CB   THR A 182    1.315   28.733   32.273  0.00  C
ATOM   1272  OG1  THR A 182    0.199   28.407   33.111  0.00  O
ATOM   1273  CG2  THR A 182    1.137   30.133   31.739  0.00  C
ATOM   1274  N    THR A 183   -0.448   26.534   30.036  0.00  N
ATOM   1275  CA   THR A 183   -1.593   25.623   30.045  0.00  C
ATOM   1276  C    THR A 183   -1.754   25.043   28.647  0.00  C
ATOM   1277  O    THR A 183   -1.274   25.608   27.675  0.00  O
ATOM   1278  CB   THR A 183   -2.909   26.342   30.433  0.00  C
ATOM   1279  OG1  THR A 183   -3.716   25.460   31.228  0.00  O
ATOM   1280  CG2  THR A 183   -3.690   26.738   29.184  0.00  C
ATOM   1281  N    ASP A 184   -2.402   23.896   28.532  0.00  N
ATOM   1282  CA   ASP A 184   -2.573   23.318   27.213  0.00  C
ATOM   1283  C    ASP A 184   -4.035   23.091   26.918  0.00  C
ATOM   1284  O    ASP A 184   -4.380   22.208   26.174  0.00  O
ATOM   1285  CB   ASP A 184   -1.810   22.005   27.113  0.00  C
ATOM   1286  CG   ASP A 184   -0.464   22.056   27.794  0.00  C
ATOM   1287  OD1  ASP A 184    0.296   23.029   27.577  0.00  O
ATOM   1288  OD2  ASP A 184   -0.152   21.080   28.527  0.00  O1-
TER    1289       ASP A 184












ATOM   1290  N    ALA B  14   37.553   22.457   29.194  0.00  N1+
ATOM   1291  H    ALA B  14   36.582   22.364   28.935  0.00  H
ATOM   1292  H    ALA B  14   37.991   23.157   28.614  0.00  H
ATOM   1293  H    ALA B  14   38.021   21.572   29.065  0.00  H
ATOM   1294  CA   ALA B  14   37.649   22.863   30.616  0.00  C
ATOM   1295  C    ALA B  14   36.345   22.665   31.400  0.00  C
ATOM   1296  O    ALA B  14   36.364   21.816   32.304  0.00  O
ATOM   1297  CB   ALA B  14   38.235   24.270   30.658  0.00  C
ATOM   1298  N    ALA B  15   35.261   23.393   31.094  0.00  N
ATOM   1299  CA   ALA B  15   35.165   24.394   30.026  0.00  C
ATOM   1300  C    ALA B  15   34.368   23.941   28.790  0.00  C
ATOM   1301  O    ALA B  15   34.957   23.330   27.892  0.00  O
ATOM   1302  CB   ALA B  15   34.779   25.773   30.573  0.00  C
ATOM   1303  N    ALA B  16   33.028   24.069   28.763  0.00  N
ATOM   1304  CA   ALA B  16   32.304   23.388   27.683  0.00  C
ATOM   1305  C    ALA B  16   31.144   24.054   26.918  0.00  C
ATOM   1306  O    ALA B  16   30.114   24.490   27.453  0.00  O
ATOM   1307  CB   ALA B  16   32.420   21.850   27.713  0.00  C
ATOM   1308  H    ALA B  16   32.544   24.608   29.452  0.00  H
ATOM   1309  N    HIS B  17   31.370   24.111   25.600  0.00  N
ATOM   1310  CA   HIS B  17   30.508   24.676   24.521  0.00  C
ATOM   1311  C    HIS B  17   29.820   23.558   23.756  0.00  C
ATOM   1312  O    HIS B  17   30.487   22.621   23.291  0.00  O
ATOM   1313  CB   HIS B  17   31.473   25.545   23.683  0.00  C
ATOM   1314  CG   HIS B  17   30.806   26.351   22.601  0.00  C


ATOM   1315  ND1  HIS B  17   30.728   26.028   21.264  0.00  N
ATOM   1316  CD2  HIS B  17   30.170   27.551   22.772  0.00  C
ATOM   1317  CE1  HIS B  17   30.054   27.014   20.648  0.00  C
ATOM   1318  NE2  HIS B  17   29.694   27.965   21.525  0.00  N
ATOM   1319  H    HIS B  17   32.233   23.710   25.292  0.00  H
ATOM   1320  N    TYR B  18   28.491   23.661   23.613  0.00  N
ATOM   1321  CA   TYR B  18   27.651   22.538   23.244  0.00  C
ATOM   1322  C    TYR B  18   26.791   22.741   21.978  0.00  C
ATOM   1323  O    TYR B  18   25.936   21.904   21.762  0.00  O
ATOM   1324  CB   TYR B  18   26.869   22.044   24.476  0.00  C
ATOM   1325  CG   TYR B  18   27.638   21.257   25.527  0.00  C
ATOM   1326  CD1  TYR B  18   27.073   20.996   26.793  0.00  C
ATOM   1327  CD2  TYR B  18   28.818   20.596   25.160  0.00  C
ATOM   1328  CE1  TYR B  18   27.702   20.099   27.685  0.00  C
ATOM   1329  CE2  TYR B  18   29.420   19.668   26.020  0.00  C
ATOM   1330  CZ   TYR B  18   28.855   19.410   27.276  0.00  C
ATOM   1331  OH   TYR B  18   29.519   18.595   28.139  0.00  O
ATOM   1332  H    TYR B  18   28.022   24.521   23.872  0.00  H
ATOM   1333  N    ASP B  19   27.328   23.446   20.986  0.00  N
ATOM   1334  CA   ASP B  19      ???      ???   19.573  0.00  C
ATOM   1335  C    ASP B  19      ???      ???   19.178  0.00  C
ATOM   1336  O    ASP B  19      ???      ???   19.367  0.00  O
ATOM   1337  CB   ASP B  19      ???      ???   18.655  0.00  C
ATOM   1338  CG   ASP B  19      ???      ???   18.926  0.00  C
ATOM   1339  OD1  ASP B  19      ???      ???   20.105  0.00  O
ATOM   1340  OD2  ASP B  19   28.588   26.117   17.941  0.00  O1-


ATOM   1341  H    ASP B  19   28.092   24.050   21.252  0.00  H
ATOM   1342  N    GLU B  20   26.024   21.140   18.622  0.00  N
ATOM   1343  CA   GLU B  20   27.219   20.341   18.451  0.00  C
ATOM   1344  C    GLU B  20   27.848   20.634   17.079  0.00  C
ATOM   1345  O    GLU B  20   27.311   20.147   16.091  0.00  O
ATOM   1346  CB   GLU B  20   26.641   18.934   18.532  0.00  C
ATOM   1347  CG   GLU B  20      ???   18.174   19.836  0.00  C
ATOM   1348  CD   GLU B  20   26.391   16.720   19.643  0.00  C
ATOM   1349  OE1  GLU B  20   26.614   16.043   20.673  0.00  O1-
ATOM   1350  OE2  GLU B  20   26.569   16.221   18.501  0.00  O
ATOM   1351  H    GLU B  20   25.129   20.696   18.442  0.00  H
ATOM   1352  N    ALA B  21   29.122   21.069   17.024  0.00  N
ATOM   1353  CA   ALA B  21   29.859   21.221   15.768  0.00  C
ATOM   1354  C    ALA B  21   30.422   19.894   15.208  0.00  C
ATOM   1355  O    ALA B  21   31.618   19.821   14.879  0.00  O
ATOM   1356  CB   ALA B  21   30.954   22.295   15.900  0.00  C
ATOM   1357  OXT  ALA B  21   29.677   18.897   15.088  0.00  O1-
ATOM   1358  H    ALA B  21   29.585   21.298   17.880  0.00  H
TER    1359       ALA B  21
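
Coordinate records of this kind lend themselves to simple geometric checks. The following is a minimal sketch in Python, not part of the original disclosure, which assumes each record is available as a single whitespace-separated row; the function names `parse_atom` and `distance` are hypothetical. It computes the distance between the CA and C atoms of ALA B 21 from the values listed above, which should come out near a typical carbon-carbon single bond length of about 1.5 Å.

```python
import math

def parse_atom(line):
    """Split one whitespace-separated ATOM row into a small dict (hypothetical helper)."""
    f = line.split()
    return {
        "serial": int(f[1]),
        "name": f[2],
        "res": f[3],
        "chain": f[4],
        "resseq": int(f[5]),
        "xyz": tuple(float(v) for v in f[6:9]),
    }

def distance(a, b):
    """Euclidean distance between two parsed atoms, in the units of the table."""
    return math.dist(a["xyz"], b["xyz"])

# Two rows copied from the table above (atoms 1353 and 1354).
rows = [
    "ATOM 1353 CA ALA B 21 29.859 21.221 15.768 0.00 C",
    "ATOM 1354 C  ALA B 21 30.422 19.894 15.208 0.00 C",
]
ca, c = (parse_atom(r) for r in rows)
print(f"CA-C distance: {distance(ca, c):.3f}")
```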













EXAMPLE 21 
Oxidative Stability of ASP 

This Example describes experiments conducted to determine the oxidative stability 
of the ASP protease and mutant proteases. The resistance to oxidation of Cellulomonas 
69B4 protease was compared to that of: a BPN'-variant protease (BPN'-variant-1; 
Genencor; See, RE 34,606 [incorporated herein by reference], for a description of this 
enzyme); a GG36 variant protease (GG36-variant-1; Genencor; See e.g., U.S. Pat. Nos. 
5,955,340 and 5,700,676, herein incorporated by reference); and PURAFECT® protease 
(Genencor). 

The assay was conducted by incubating a sample of the protease with 0.1 M H2O2. 
A 2.0 ml volume of 0.1 M Borate buffer (45.4 gm Na2B4O7·10H2O), pH 9.45, containing 0.1 
M H2O2 and 100 ppm protease was incubated at 25°C for 20 minutes and assayed for 
enzyme activity. 

The enzyme activity was determined as follows: 50 µl of the incubation mixture was 
combined with 950 µl 0.1 M Tris buffer, pH 8.6, and a 10 µl sample of this dilution was 
added to 990 µl AAPF substrate solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, 
pH 8.6. The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was 
monitored. The results obtained for these proteases are provided in Figure 31. As 
indicated in this graph, protease 69B4 showed greatly enhanced stability under oxidative 
conditions relative to the subtilisin proteases. 
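
The two-step dilution and the rate measurement described above can be sketched as follows. This is an illustrative Python sketch only: the helper names `final_ppm` and `slope` are hypothetical, and the absorbance readings are invented placeholder values, not data from these experiments. It shows that 100 ppm protease diluted 50 µl into 950 µl and then 10 µl into 990 µl gives roughly 0.05 ppm in the assay, and that the reaction rate can be estimated as the least-squares slope of absorbance at 410 nm against time.

```python
def final_ppm(stock_ppm, steps):
    """Apply successive dilutions given as (sample_ul, diluent_ul) pairs."""
    conc = stock_ppm
    for sample_ul, diluent_ul in steps:
        conc *= sample_ul / (sample_ul + diluent_ul)
    return conc

def slope(times, absorbances):
    """Least-squares slope of absorbance versus time (the reaction rate)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_a = sum(absorbances) / n
    num = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, absorbances))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den

# Dilution scheme from the text: 50 ul + 950 ul buffer, then 10 ul + 990 ul substrate.
assay_ppm = final_ppm(100, [(50, 950), (10, 990)])

# Hypothetical A410 readings at 0, 1 and 2 minutes, for illustration only.
rate = slope([0, 1, 2], [0.10, 0.20, 0.30])
print(assay_ppm, rate)
```

Residual activity would then be the rate for the treated sample divided by the rate for an untreated control, under the assumption that both are measured at the same dilution.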
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EXAMPLE 22 
Chelate Stability of ASP 

In this Example, experiments to determine the chelate stability of ASP are described. 
The resistance of 69B4 protease to the presence of a chelator was assayed by incubating 
an aliquot of the enzyme with 10 mM EDTA in 50 mM Tris, pH 8.2. The same enzyme 
preparations as used in Example 21 were used in these experiments. 

Specifically, a volume of 2.0 ml 50 mM Tris buffer, pH 8.2, containing 10 mM EDTA 
and 100 ppm protease was incubated at 45°C for 100 minutes and assayed for enzyme 
activity as follows: 50 µl of the incubation mixture was combined with 950 µl 0.1 M Tris 
buffer, pH 8.6, and a 10 µl sample of this dilution was added to 990 µl AAPF substrate 
solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. 

The rate of increase in absorbance at 410 nm due to release of p-nitroaniline was 
monitored. The results obtained for these four proteases are shown in Figure 32. As 
indicated by these results, protease 69B4 showed greatly enhanced stability in the presence 
of a chelator compared to BPN' variant-1, PURAFECT®, or GG36 variant-1. 

EXAMPLE 23 
Thermal Stability of ASP 

In this Example, experiments conducted to determine the thermostability of ASP 
protease are described. In one set of experiments, 69B4 protease was tested for 
resistance to thermal inactivation in solution. As in Examples 21 and 22, a BPN' variant 
(BPN'-variant-1), PURAFECT®, and a GG36 variant (GG36-variant-1) were also tested and 
compared with ASP. 

The thermal inactivation was performed by incubating a volume of 2.0 ml 50 mM 
Tris buffer, pH 8.0, containing 100 ppm protease at 45°C for 300 minutes and assaying for 
enzyme activity as follows: 50 µl of the incubation mixture was combined with 950 µl 0.1 M 
Tris buffer, pH 8.6, and a 10 µl sample of this dilution was added to 990 µl AAPF substrate 
solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of increase in 
absorbance at 410 nm due to release of p-nitroaniline was monitored. The results for these 
four proteases are shown in Figure 33. As shown by these results, protease 69B4 showed 
thermal stability at 45°C that was comparable to or greater than that of the BPN' variant, 
PURAFECT®, or the GG36 variant. 

In addition to the above experiments, an alternative method for determining the 
thermostability of ASP was also tested. In these experiments, a temperature gradient 
of 57°C to 62°C was used. The thermal inactivation (using a thermocycler, MTP plate 
DNA Engine Tetrad; MJ Research) was performed by incubating a volume of 180 µl 100 mM 
Tris buffer, pH 8.6, containing 1 mM CaCl2 and 5 ppm protease for 60 minutes and assaying 
for enzyme activity as follows: 10 µl was taken and added to 190 µl AAPF substrate 
solution, conc. 1 mg/ml, in 0.1 M Tris / 0.005% TWEEN®, pH 8.6. The rate of increase in 
absorbance at 410 nm due to release of p-nitroaniline was monitored (at 25°C). The results 
for the four proteases are shown in Figure 34. 

EXAMPLE 24 
pH profile of ASP Protease on DMC Substrate 

In this Example, experiments conducted to determine the pH profile of the ASP protease are 
described. The Cellulomonas 69B4 protease of the present invention, isolated and purified by 
methods described herein, and three currently used subtilisin proteases (PURAFECT®, 
BPN'-variant-1, GG36-variant-1) described in Examples 21-23, were analyzed for their ability to 
hydrolyze a commercial synthetic substrate, di-methyl casein ("DMC"; Sigma C-9801), in the pH 
range from 4 to 12. 

The DMC method described at the beginning of the Experimental section was used, 
with modifications, as indicated below. Briefly, a 5 mg/ml DMC substrate solution was 
prepared in the appropriate buffer (5 mg/ml DMC, 0.005% (w/w) TWEEN-80® 
(polyoxyethylene sorbitan mono-oleate, Sigma P-1754)). The appropriate DMC buffers 
were composed as follows: 40 mM MES for pH 4 and 5; 40 mM HEPES for pH 6 and 7; 40 
mM TRIS for pH 8 and 9; and 40 mM Carbonate for pH 10, 11 and 12. 

For the determination, 180 µl of each pH-substrate solution was transferred into a 96 
well microtiter plate and pre-incubated at 37°C for twenty minutes prior to enzyme 
addition. The respective enzyme solutions (BPN'-variant-1; GG36-variant-1; PURAFECT®; 
and 69B4 protease) were prepared, containing about 25 ppm protease, and 20 µl of these 
enzyme solutions were pipetted into the substrate-containing wells in order 
to achieve a 2.5 ppm final enzyme concentration in each well. The 96 well plate containing 
enzyme-substrate mixtures was incubated at 37°C and 300 rpm for one hour in an IKS- 
Multitron incubator/shaker. 

A 2,4,6-trinitrobenzene sulfonate ("TNBS") color reaction method was used to 
determine the amount of peptides and amino acids released from the DMC substrate. The free 
amino groups (of the peptides and amino acids) react with 2,4,6-trinitrobenzene sulfonic 
acid to form a yellow colored complex. The absorbance was measured at 405 nm in a 
SpectraMax 250 MTP Reader. 

The TNBS assay was conducted as follows. A 1 mg/ml solution of TNBS (5% 
2,4,6-trinitrobenzene sulfonic acid; Sigma P2297) was prepared in reagent buffer A (2.4 g 
NaOH, 45.4 g Na2B4O7·10H2O dissolved by heating in 1000 ml). Then, 60 µl per well were 
aliquoted into a 96-well plate and 10 µl of the incubation mixture described above were 
added to each well and mixed for 20 minutes at room temperature. Then, 200 µl of reagent 
B (70.4 g NaH2PO4·H2O and 1.2 g Na2SO3 in 2000 ml) were added to each well and mixed 
to stop the reaction. The absorbance at 405 nm was measured in a SpectraMax 250 MTP 
Reader. The absorbance value was corrected for a blank (without enzyme). 

The data in Table 24-1 show the comparative ability of the 69B4 protease to hydrolyze this 
substrate versus known mutant variant proteases (BPN' variant-1 and GG36 
variant-1). 

Also, as shown in Figure 35, the serine protease of the present invention showed 
comparable or increased hydrolysis of DMC substrate, with optimal DMC-hydrolysis 
activity over a broad pH range from 7 to 12. 



Table 24-1. TNBS Response (OD 405 nm)

Enzyme            pH 4    pH 5    pH 6    pH 7    pH 8    pH 9    pH 10   pH 11   pH 12
BPN' variant-1    0.095   0.174   0.482   0.749   0.813   0.847   0.730   0.683   0.590
GG36 variant-1    0.228   0.172   0.499   0.740   0.958   1.062   1.068   1.175   1.136
PURAFECT®         0.042   0.202   0.545   0.783   0.956   1.130   1.102   1.188   1.174
69B4              0.252   0.218   0.575   0.742   0.803   0.965   0.762   0.741   0.729
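
As an illustration of how the values in Table 24-1 might be summarized, the following Python sketch (not part of the original disclosure; the names `ph_optimum` and `relative` are hypothetical) locates the pH of maximal TNBS response for each enzyme and expresses each response as a percentage of that enzyme's maximum. The numbers are copied from the table.

```python
PH = [4, 5, 6, 7, 8, 9, 10, 11, 12]

# TNBS response (OD 405 nm) per enzyme, copied from Table 24-1.
TNBS = {
    "BPN' variant-1": [0.095, 0.174, 0.482, 0.749, 0.813, 0.847, 0.730, 0.683, 0.590],
    "GG36 variant-1": [0.228, 0.172, 0.499, 0.740, 0.958, 1.062, 1.068, 1.175, 1.136],
    "PURAFECT": [0.042, 0.202, 0.545, 0.783, 0.956, 1.130, 1.102, 1.188, 1.174],
    "69B4": [0.252, 0.218, 0.575, 0.742, 0.803, 0.965, 0.762, 0.741, 0.729],
}

def ph_optimum(values):
    """Return the pH at which the response is highest."""
    return PH[values.index(max(values))]

def relative(values):
    """Express each response as a rounded percentage of the enzyme's maximum."""
    peak = max(values)
    return [round(100 * v / peak) for v in values]

for enzyme, values in TNBS.items():
    print(enzyme, ph_optimum(values), relative(values))
```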



EXAMPLE 25 
pH Stability of ASP Protease 

In this Example, experiments conducted to determine the pH stability of the ASP 
protease are described. As in Examples 21-24, two currently used subtilisin proteases 
(PURAFECT® and BPN'-variant-1) were also tested. 

The respective enzyme solutions (i.e., BPN'-variant-1, PURAFECT®, and 69B4 
protease) were prepared containing 90 ppm protease in 0.1 M Citrate buffer, pH 3, 4, 5 and 
6. Then, 10 ml tubes containing 1 ml of buffered enzyme solutions were placed in a GFL 
1083 water bath set at 25°C, 35°C and 45°C, respectively, for 60 minutes. AAPF activity 
was determined for each enzyme sample at time 0 and 60 minutes as described above. The 
remaining enzyme activity was calculated and the results are provided in Table 25-1 below, 
and are shown in Figures 25-28. 

As indicated by the data in Table 25-1, the ASP protease is exceptionally stable at pH 
3, 4, 5, and 6, at temperatures between 25°C and 45°C, as compared to the BPN' variant-1 
and PURAFECT®. 



Table 25-1. pH Stability Data (% remaining activity)

         BPN' Variant-1         PURAFECT®              ASP
pH       25°    35°    45°      25°    35°    45°      25°    35°    45°
pH 3      39      1      0       42      2      0       97    109     95
pH 4      92     35      1       55      7      0      106    105    102
pH 5     112     82     12       95     68      8      114    115    106
pH 6     113     99     59      104     96     63       95    104    104
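
The comparison drawn from Table 25-1 can be made concrete with a short Python sketch. This is an illustration only, assuming the table values are percent activity remaining after 60 minutes; the name `worst_case` is hypothetical. It reports the lowest remaining activity observed for each enzyme across all tested pH values and temperatures.

```python
# Activity remaining after 60 minutes, copied from Table 25-1:
# {enzyme: {pH: (25 C, 35 C, 45 C)}}
REMAINING = {
    "BPN' variant-1": {3: (39, 1, 0), 4: (92, 35, 1), 5: (112, 82, 12), 6: (113, 99, 59)},
    "PURAFECT": {3: (42, 2, 0), 4: (55, 7, 0), 5: (95, 68, 8), 6: (104, 96, 63)},
    "ASP": {3: (97, 109, 95), 4: (106, 105, 102), 5: (114, 115, 106), 6: (95, 104, 104)},
}

def worst_case(enzyme):
    """Lowest remaining activity for an enzyme over every pH and temperature."""
    return min(v for row in REMAINING[enzyme].values() for v in row)

for enzyme in REMAINING:
    print(enzyme, worst_case(enzyme))
```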



EXAMPLE 26 
Stability and Specificity of ASP 

In this Example, experiments conducted to determine the stability and specificity 
differences between ASP, ASP mutants, and FNA are described. These experiments were 
performed by formulating liquid TIDE® detergent (Procter & Gamble) with calcium formate 
(an anionic surfactant titrant), borate (a P1 binder/inhibitor), and glycerol (water ordering), 
either independently of or in combination with each other. The enzyme was tested under 
these conditions and the residual enzyme activity was determined over time at a fixed 
temperature. 

The experiments are described in greater detail below. Unformulated liquid TIDE® 
detergent (i.e., without added enzyme stabilizing chemicals) was divided into eleven 
aliquots. Then, glycerol, borax, or calcium formate were added to the detergent aliquots in 
the proportions shown in Table 26-1. 



Table 26-1. Detergent Additives (%)

Aliquot #    % Glycerol    % Borax    % Calcium Formate
1            5             0          0.1
2            2.5           1.5        0.05
3            5             3          0
4            0             3          0
5            2.5           1.5        0.05
6            0             0          0.1
7            0             3          0.1
8            0             0          0
9            5             0          0
10           2.5           1.5        0.05
11           5             3          0.1



Each aliquot was pre-warmed to 90°F, and either FNA, ASP (wild-type), or an ASP
R18 variant was added to approximately one gram per liter protease. After thorough mixing,
a portion was removed and assayed for activity with synthetic AAPF-pNA substrate, as
described above. After the assay, each aliquot was placed back into a 90°F oven. The
assay process was repeated over time, and the decline in activity was plotted as the
percentage of T0 (time zero) activity remaining.
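The normalization used for these plots can be sketched as follows. This is a minimal illustration, assuming that each time point is simply divided by the first (T0) reading; the function name and the sample readings are hypothetical and are not data from this Example.

```python
def percent_t0_remaining(rates):
    """Express each assay rate as a percentage of the first (T0) rate."""
    if not rates:
        raise ValueError("at least one reading is required")
    t0 = rates[0]
    if t0 <= 0:
        raise ValueError("T0 activity must be positive")
    return [100.0 * rate / t0 for rate in rates]


# Hypothetical readings in arbitrary rate units (illustration only).
readings = [0.80, 0.60, 0.40]
for index, percent in enumerate(percent_t0_remaining(readings)):
    print(f"time point {index}: {percent:.1f}% of T0 activity")
```

Normalizing to T0 in this way allows aliquots with slightly different starting activities to be compared on the same axis.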

Surprisingly, it was found that ASP did not have the same calcium formate or 
glycerol dependency as FNA. Furthermore, it was determined that borate (alone) had the 
most dramatic effect on stabilizing ASP. It was also found that the addition of stabilizing 
chemicals provided significant benefits to the wild-type ASP, as well as the ASP R18 variant, 
indicating that the variant site is independent of the borate-activated site. 

EXAMPLE 27 
LAS Stability of ASP 

In this Example, experiments conducted to determine the stability of ASP to anionic 
surfactants are described. LAS (linear alkyl sulfonate), an anionic surfactant, is a 
component of HDL detergents known to inactivate enzymes. The methods used are 
described above. 

It was determined that wild-type ASP incubated in LAS dissolved in Tris HCl, pH 8.6,
is inactivated (See, Table 27-1, below). Further study revealed that inactivation is rapid
(See, Table 27-2). As LAS is a negatively charged molecule, it was hypothesized that
electrostatic attraction between LAS and the positively charged amino acid side chains of
ASP was the cause of the LAS sensitivity. To test this hypothesis, arginine residues
(wild-type ASP contains no lysine residues) were mutated to other amino acids.

Incubation of these mutants in 0.05% (w/v) LAS in Tris HCl, pH 8.6, for one hour
revealed that all arginine replacement mutants were more stable than wild-type ASP. In
contrast, non-arginine replacement mutations that were also tested for LAS stability were
generally not improved compared to wild-type (See, Table 27-3). Subsequent testing of
multiple arginine replacement mutations revealed that these enzymes are substantially more
stable than the wild-type enzyme, and more stable than single arginine replacement mutants
(See, Table 27-4).

Another anionic surfactant that is used in HDL detergents is AES. Wild-type ASP 
was found to be unstable in high concentrations of AES (See, Table 27-5). The mutant ASP 
R18 was found to be more stable than wild-type in AES (See, Table 27-5). Also, the rate of 
inactivation of activity by 5% AES was found to be higher for the wild-type than the ASP R18 
mutant (See, Table 27-6). These results confirm that replacement of arginine residues of 
ASP improves the stability of ASP in anionic detergents in general. It is not intended that 
the present invention be limited to any specific anionic detergents or mutations. Indeed, it is 
contemplated that various anionic detergents (as well as other detergents) will find use in 
the present invention, as will various ASP mutants. 



Table 27-1. Inactivation of ASP by LAS in Tris HCl, pH 8.6

% LAS (w/v)        % Activity of Control

Control (0 LAS)    100
0.01               87
0.03               77
0.06               59
0.10               47
0.30               31
0.60               20
1.00               12



Table 27-2. Time-course of ASP Inactivation by 0.1% LAS

Time (secs)    % Remaining Activity

0              100
60             45
120            26
240            20
600            11

Table 27-3. Stability of ASP and Single Mutants
(Incubated in 0.05% LAS in Tris HCl, pH 8.6, for 60 mins.)

Mutant       % Remaining Activity of 0 LAS Control

Wild-type    18
R14L         47
R16I         49
R16L         56
R16Q         51
R35F         43
R127A        59
R127K        31
R127Q        52
R159K        25
T36S         11
G65Q         22
Y75G         7
N76L         17
S76V         17



Table 27-4. Stability of ASP and Multiple Arginine Replacements
(Incubated in 0.05% LAS in Tris HCl, pH 8.6, for 60 mins.)

Mutant       % Remaining Activity of 0 LAS Control

Wild-type    27.5
ASP R-1      98.8
ASP R-2      69.6
ASP R-3      100.2
ASP R-7      103.9
ASP R-10B    98.9
ASP R-18     100.9
ASP R-23     79.4

In this Table,

R-1=R16Q/R35F/R159Q
R-2=R159Q
R-3=R16Q/R123L
R-7=R14L/R127Q/R159Q
R-10B=R14L/R179Q
R-18=R123L/R127Q/R179Q
R-21=R16Q/R79T/R127Q
R-23=R16Q/R79T

Table 27-5. Inactivation of ASP and ASP Mutant R-18 by AES in Tris HCl, pH 8.6
% Remaining Activity of 0% AES Control

% AES (v/w)    Wild-type ASP    ASP R-18

0              100              100
1              70               94
5              32               57

Table 27-6. Time-course of ASP and Mutant R-18 Inactivation
by 5% AES in Tris HCl, pH 8.6

% Remaining Activity of 0% AES Control

Time (Mins)    Wild-type ASP    ASP R-18

0              100              100
90             99               105
4020           15               83



EXAMPLE 28 

Determination of ASP Autolysis Sites in the Presence and Absence of LAS Detergent 

In this Example, experiments conducted to determine the ASP autolysis sites in the
presence and absence of LAS are described. ASP autolysis was evaluated in a buffer with
and without LAS (dodecylbenzene-sulfonic acid). Autolysis peptide assignments were made
based on the molecular weight and sequence of each peptide (from MS and MS/MS data,
respectively).

ASP (at a concentration of 0.35 µg/µL) was incubated (at 4°C) in 100 mM Tris, pH 8.6,
with and without 0.1% LAS (dodecylbenzene-sulfonic acid). Aliquots were taken at time
periods from 0 to 30 min of incubation and autolysis was terminated by the addition of TFA
(final concentration 1%). Aliquots (10 µL) were analyzed by liquid chromatography coupled
with electrospray tandem mass spectrometry (LC-ESI-MS/MS). Peptides were resolved
using an HPLC system (model 1100, Agilent Technologies) with a reversed-phase column
(Vydac C4, 0.3 mm ID x 150 mm) and a gradient from 0 to 100% solvent B in 60 min at a
flow rate of 5 µL/min (generated using a static split from a pump flow rate of 250 µL/min).
Solvent A consisted of 0.1% formic acid in water, and solvent B was 0.1% formic acid in
acetonitrile.

Mass spectra were acquired using an ion trap mass spectrometer (model LCQ Classic,
Thermo). The mass spectrometer was tuned for optimum detection of m/z 785 and
operated with a spray voltage of 2.5 kV and a heated capillary at 250°C. Mass spectra were
acquired with an injection time of 500 msec and 5 microscans. Tandem MS spectra were
acquired in data-dependent mode, with the most intense peak selected and fragmented with
a normalized collision energy of 35%. For relative peptide quantitation, peak areas were
determined using vendor software. The identity of the autolysis peptides was determined
using a database search program (TurboSequest, Thermo) run on a database containing
the ASP sequence. Database searches were performed with no enzyme selected, threshold of
10000, dta file parameters (peptide m/z error of 1.7, group 11, minimum ion count 15), and
database parameters (peptide error of 2.2, MS/MS ions error of 0.0, both B,Y ions).

Without LAS in the sample buffer, ASP cleavages were primarily observed at the
termini and in the middle of the molecule (positions Y9, F47, F59, F165, Q174, Y176; See
Table 28-1, below). Relative quantitative data for observed peptides and intact ASP were
plotted over the course of the experiment (See, Figure 25, Panel A). The majority of the
ASP remained intact and only 1% was in the form of cleaved peptides (protein:peptide ratio
of 99:1). These data indicated that the majority of ASP remains intact, folded, and resistant
to further autolytic cleavage.

With 0.1% LAS in the sample buffer, ASP cleavages were observed throughout the
protein (positions Y9, T40, F47, Y57, F59, R61, L69, F165, Q174, Y176). The majority of
the ASP was in the peptide form after 10 min (See, Figure 25, Panel B). After 60 min, the
protein:peptide ratio was <1:99. These data indicate that ASP is totally unfolded in the
presence of LAS detergent, and thus extensive cleavage throughout the sequence was
observed. The observed autolysis cleavage sites under the two conditions are summarized
in the following Table. In this Table, the amino acids preceding and following the periods
are the amino acids that immediately precede and follow the autolysis peptide. The
sequence between the periods indicates the sequence of the autolysis peptide observed.



Table 28-1. ASP Autolysis Peptides Observed With and Without 0.1% LAS

Peptide Sequence Observed            Start-End    Calculated    Measured     Observed in    Observed in
                                                  Mass (Da)     Mass (Da)    0.1% LAS       0% LAS

-.FDVIGGNAY.T (SEQ ID NO:631)        [1-9]        954.5         954.4        Y              Y
T.ANPTGTF.A (SEQ ID NO:632)          [41-47]      706.3         706.3        Y              N
F.AGSSFPGNDY.A (SEQ ID NO:633)       [48-57]      1013.4        1013.3       Y              N
F.AGSSFPGNDYAF.V (SEQ ID NO:634)     [48-59]      1231.5        1231.4       Y              Y
R.TGAGVNLL.A (SEQ ID NO:635)         [62-69]      743.3         743.4        Y              N
F.FQPVNPI.L (SEQ ID NO:636)          [166-172]    813.4                      N              N
F.FQPVNPILQ.A (SEQ ID NO:637)        [166-174]    1054.6        1054.5       Y              Y
F.FQPVNPILQAY.G (SEQ ID NO:638)      [166-176]    1288.7        1288.5       Y              Y
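The peptide notation and the calculated masses in Table 28-1 can be illustrated with a short sketch. It assumes that the calculated masses are neutral monoisotopic masses (sum of standard residue masses plus one water), which appears consistent with the tabulated values to within about 0.1 Da for the entries checked; the function names are hypothetical and are not part of the search software named above.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added for the free N- and C-termini


def split_flanked(entry):
    """Split 'F.AGSSFPGNDY.A' into (preceding residue, peptide, following residue)."""
    before, peptide, after = entry.split(".")
    return before, peptide, after


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


before, peptide, after = split_flanked("T.ANPTGTF.A")
print(before, peptide, after, round(monoisotopic_mass(peptide), 2))
```

Under this assumption, the [41-47] peptide ANPTGTF gives about 706.33 Da, against the tabulated calculated mass of 706.3 Da.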



EXAMPLE 29 

Use of Reversible Inhibitors to Reduce LAS-Induced Degradation of ASP 

In this Example, experiments conducted to assess the use of reversible inhibitors to
reduce LAS-induced degradation of ASP are described. Benzamidine (BZA) is a known
reversible inhibitor of serine proteases. Using the standard succ-AAPF-pNA assay as
described above, BZA was shown to inhibit the activity of approximately 2 µg/ml ASP, with
essentially complete inhibition occurring at 1000 mM (1 M), as indicated in Table 29-1, below:



Table 29-1. Inhibition of ASP

BZA Conc. (mM)    Assay Rate

0                 0.83
1                 0.85
10                0.82
100               0.42
1000              0.02
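The assay rates in Table 29-1 can be restated as percent inhibition relative to the rate measured without BZA. The following is a minimal sketch under that assumption; the function name is hypothetical, and only the rates listed in the table are used.

```python
def percent_inhibition(rate, uninhibited_rate):
    """Percent loss of assay rate relative to the rate measured without inhibitor."""
    if uninhibited_rate <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * (1.0 - rate / uninhibited_rate)


# BZA concentration (mM) -> assay rate, as listed in Table 29-1.
table_29_1 = {0: 0.83, 1: 0.85, 10: 0.82, 100: 0.42, 1000: 0.02}

for conc_mM, rate in table_29_1.items():
    inhibition = percent_inhibition(rate, table_29_1[0])
    print(f"{conc_mM:>5} mM BZA: {inhibition:6.1f}% inhibition")
```

A slightly negative value, as for the 1 mM entry, simply reflects a rate marginally above the 0 mM control.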



Approximately 200 µg/ml ASP was then incubated with 0.1% LAS, with and
without 1 M BZA, for up to 4 days. Enzyme activity was measured at different time points by
addition of 10 µl of incubated sample to 990 µl of assay solution. This reduces the BZA
concentration to 10 mM, which by reference to the table above is not inhibitory. Therefore,
any loss of activity is due to enzyme degradation. As indicated in the results below,
enzyme incubated with 0.1% LAS and without BZA lost all activity (i.e., it was degraded),
while enzyme incubated with 0.1% LAS and 1 M BZA retained activity over the 4 day
time-course of the study, demonstrating that inhibition of ASP activity prevents degradation
by LAS.

Table 29-2. Assay Rate Results for Enzyme Incubated with 0.1% LAS,
With/Without BZA

Time       ASP+0.1% LAS    ASP+0.1% LAS+1M BZA

30 secs    0.755           0.761
30 mins    0.685           0.781
18 hrs     0.067           0.761
4 days     0.004           0.853
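The dilution step in this Example (10 µl of incubated sample added to 990 µl of assay solution) determines the BZA concentration actually present during the activity measurement. A minimal sketch of that arithmetic follows; the function name is hypothetical.

```python
def diluted_concentration(stock_mM, sample_ul, diluent_ul):
    """Concentration after mixing a sample volume into a diluent volume."""
    total_ul = sample_ul + diluent_ul
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_mM * sample_ul / total_ul


# 1 M (1000 mM) BZA, 10 ul of sample into 990 ul of assay solution.
bza_in_assay_mM = diluted_concentration(1000.0, 10.0, 990.0)
print(bza_in_assay_mM)  # 10.0
```

The 100-fold dilution brings the BZA into the range that Table 29-1 shows to have essentially no effect on the assay rate.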



EXAMPLE 30 
Testing of Mutant ASPs 

In addition to the tests described above, tests were conducted on various mutants of 
ASP. The methods described above in Example 1 were used. In the following Tables, 
"Variant Code" provides the wild-type amino acid, the position in the amino acid sequence, 
and the replacement amino acid (i.e., "F001A" indicates that the phenylalanine at position 1 
in the amino acid sequence has been replaced by alanine in this particular variant). 
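The variant code convention described above can be parsed mechanically. The following is a minimal sketch, assuming single-letter amino acid codes and a zero-padded position as in "F001A"; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    wild_type: str    # single-letter code of the wild-type amino acid
    position: int     # position in the amino acid sequence
    replacement: str  # single-letter code of the replacement amino acid


_VARIANT_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant_code(code):
    """Parse a code such as 'F001A' into its three components."""
    match = _VARIANT_CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a valid variant code: {code!r}")
    return Variant(match.group(1), int(match.group(2)), match.group(3))


print(parse_variant_code("F001A"))
```

Because the position is converted to an integer, "R014L" and "R14L" parse to the same variant.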

Keratin Hydrolysis 

The table (Table 30-1) below provides the keratin hydrolysis data obtained for the 
ASP variants which show activity on this substrate in the keratin assay as described above 
("Protease Assay with Keratin in Microtiter Plates"). The values are relative to wild type 
(WT) and calculated as described in the assay procedure. Values greater than 1 are 
indicative of better activity than WT ASP. 

Table 30-1. Keratin Hydrolysis Results

Variant code    Keratin hydrolysis relative

F001T    1.24
F001D    1.13
F001H    1.04
F001M    1.01
F001E    1.01
V003L    1.08
I004E    1.00
N007L    1.18
A008E    1.18
A008G    1.13
A008D    1.04
T010N    1.27
T010E    1.20
T010D    1.13
T010G    1.04
I011A    1.01
G012D    1.17
G013S    1.16
G013M    1.03
G013A    1.01
R014L    1.52
R014Q    1.49
R014I    1.40
R014D    1.36
R014N    1.29
R014G    1.28
R014T    1.21
R014M    1.21
R014K    1.18
R014A    1.12
R014S    1.12
R014W    1.07
R014P    1.04
R014H    1.03
S015W    1.20
S015T    1.05
R016A    1.04
R016S    1.03
R016Q    1.03
I019V    1.11
N024E    2.44
N024A    1.72
N024T    1.55
N024Q    1.40
N024V    1.28
N024L    1.26
N024H    1.26
N024M    1.14
N024F    1.05
N024S    1.03
R035E    1.60
R035L    1.47
R035Q    1.42
R035F    1.41
R035A    1.37
R035K    1.26
R035T    1.22
R035H    1.18
R035M    1.17
R035Y    1.16
R035W    1.13
R035S    1.12
R035D    1.07
R035N    1.03
R035V    1.02
T036I    6.82
T036S    1.34
T036G    1.34
T036N    1.22
T036D    1.16
T036H    1.13
T036P    1.03
T036L    1.01
A038R    1.77
A038D    1.51
A038H    1.30
A038N    1.28
A038F    1.22
A038L    1.19
A038S    1.18
A038Y    1.17
A038T    1.10
A038V    1.07
A038G    1.03
A038I    1.01
T040V    1.11
A041N    1.17
A041D    1.17
A041I    1.07
A041L    1.03
T044E    1.03
A048E    1.09
G049A    1.36
G049S    1.26
G049H    1.16
G049F    1.13
G049L    1.04
G049T    1.00
S051D    1.33
S051Q    1.18
S051H    1.12
S051V    1.11
S051T    1.09
S051M    1.01
G054D    1.71
G054E    1.23
G054N    1.06
G054L    1.02
G054I    1.00
N055E    1.30
N055F    1.25
N055Q    1.05
R061M    1.20
R061T    1.16
R061E    1.16
R061H    1.10
R061S    1.09
R061N    1.08
R061K    1.07
R061V    1.01
T062I    1.00
G063D    1.18
G063V    1.07
A064I    1.40
A064N    1.21
A064Y    1.19
A064L    1.17
A064V    1.17
A064H    1.16
A064F    1.15
A064P    1.15
A064T    1.13
A064Q    1.13
A064M    1.13
A064S    1.11
A064W    1.09
A064G    1.01
G065P    1.42
G065D    1.29
G065Q    1.29
G065S    1.25
G065T    1.25
G065V    1.23
G065L    1.21
G065Y    1.16
G065A    1.05
G065R    1.02
N067D    1.36
N067G    1.20
N067T    1.12
N067E    1.12
N067S    1.10
N067H    1.09
N067A    1.08
N067Q    1.07
N067L    1.05
L068H    1.07
L069S    1.35
L069H    1.23
L069V    1.03
A070D    1.20
A070H    1.16
A070G    1.13
A070S    1.04
Q071G    1.20
Q071H    1.14
Q071D    1.13
Q071S    1.10
Q071A    1.07
Q071N    1.06
Q071I    1.06
V072I    1.11
N073T    1.95
N073S    1.07
N074G    1.75
Y075G    1.42
Y075F    1.24
S076D    1.69
S076V    1.48
S076E    1.47
S076Y    1.45
S076T    1.25
S076L    1.25
S076N    1.24
S076I    1.22
S076W    1.17
S076Q    1.13
S076A    1.08
G077T    2.13
G077S    1.21
G077N    1.06
G078D    1.35
G078A    1.27
G078S    1.07
G078N    1.07
G078V    1.03
G078T    1.00
R079G    1.48
R079D    1.44
R079P    1.43
R079A    1.31
R079E    1.31
R079L    1.25
R079V    1.25
R079T    1.23
R079M    1.23
R079S    1.23
R079C    1.02
V080L    1.03
Q081E    1.22
Q081D    1.12
Q081V    1.10
Q081H    1.10
Q081P    1.01
A083E    1.27
A083L    1.05
A083I    1.03
H085Q    1.26
H085T    1.22
H085L    1.14
H085M    1.10
H085A    1.06
H085S    1.02
T086D    1.33
T086E    1.24
T086I    1.08
T086L    1.07
T086Q    1.07
T086G    1.06
T086A    1.05
T086N    1.01
A088E    1.01
A088F    1.00
P089E    1.04
V090P    1.51
V090S    1.42
V090I    1.34
V090T    1.22
V090N    1.10
V090A    1.08
V090L    1.06
S092G    1.20
S092A    1.12
S092C    1.06
A093D    1.20
A093S    1.12
A093E    1.09
S099N    1.27
S099V    1.23
S099D    1.21
S099T    1.21
S099I    1.08
T101S    1.14
W103M    1.17
T107E    1.32
T107S    1.30
T107V    1.23
T107H    1.23
T107M    1.21
T107I    1.17
T107N    1.12
T107A    1.10
T107Q    1.03
T107K    1.01
T109E    1.36
T109I    1.11
T109G    1.10
T109A    1.10
T109L    1.08
T109P    1.05
T109H    1.03
T109N    1.00
A110S    1.10
A110T    1.03
A110H    1.01
L111E    1.08
N112E    1.61
N112D    1.42
N112Q    1.36
N112L    1.27
N112V    1.23
N112Y    1.20
N112I    1.13
N112S    1.06
N112R    1.04
S113T    1.21
S114A    1.12
V115A    1.15
T116E    1.34
T116Q    1.28
T116F    1.09
T116S    1.02
T121E    1.35
T121D    1.15
T121S    1.05
R123E    1.63
R123D    1.57
R123I    1.48
R123F    1.40
R123A    1.30
R123L    1.30
R123Q    1.29
R123N    1.24
R123H    1.22
R123T    1.16
R123Y    1.15
R123S    1.12
R123G    1.11
R123V    1.09
R123W    1.07
R123K    1.07
G124A    1.06
I126L    1.06
R127A    1.38
R127Q    1.23
R127H    1.19
R127S    1.19
R127K    1.17
R127Y    1.15
R127E    1.14
R127F    1.11
R127T    1.04
R127C    1.01
T129S    1.31
A132S    1.03
P134A    1.04
S140A    1.02
L142V    1.31
A143N    1.07
N145E    1.33
N145D    1.14
N145T    1.10
N145S    1.07
N145Q    1.07
V150L    1.01
N157D    1.01
R159E    1.61
R159F    1.37
R159N    1.30
R159Q    1.28
R159D    1.23
R159K    1.20
R159C    1.19
R159S    1.10
R159A    1.10
R159L    1.09
R159Y    1.08
R159H    1.08
R159V    1.08
R159G    1.06
R159M    1.06
T160E    1.19
T160D    1.02
G161K    1.04
T163D    1.11
T163I    1.08
T163C    1.03
Q167T    1.02
N170Y    2.23
N170D    1.38
N170L    1.12
N170A    1.06
N170C    1.04
N170G    1.04
I172T    6.27
A175E    1.04
G177M    1.01
R179V    1.60
R179T    1.53
R179D    1.48
R179N    1.42
R179E    1.42
R179M    1.41
R179A    1.39
R179I    1.38
R179K    1.32
R179Y    1.27
R179L    1.11
R179W    1.06
I181L    1.96
I181S    1.07
T182V    1.14
T182L    1.02
T183E    1.19
T183I    1.17
T183Q    1.07
T183D    1.05
D184E    1.02
S185N    1.11
S185D    1.03
S185M    1.03
S185G    1.01
G186N    2.05
S187H    1.05
S187E    1.01
S188E    1.08



DMC Assay 

The following table (Table 30-2) provides the variants with improved specific activity 
on casein. The activity on casein as substrate for all variants was determined as described 
above ("Protease Assay with Dimethylcasein (96 wells), With or Without Preheating of the 
Protease for Activity and Thermostability Assays"). The values in the table provide relative 
values for each variant compared to the activity of the WT enzyme (i.e., each value is the 
quotient of (variant activity)/(wild type activity)). Every variant with a value higher than 1 is 
better than WT. 



Table 30-2. DMC Assay Results

Variant code    Casein specific activity relative to wild type

F001T    1.19
F001A    1.11
F001G    1.00
D002G    1.24
D002Q    1.24
D002A    1.12
D002H    1.10
D002N    1.10
V003L    1.33
V003I    1.28
V003T    1.17
I004V    1.07
I004Q    1.02
N007L    1.56
N007S    1.25
N007A    1.22
N007H    1.11
N007I    1.11
N007V    1.06
A008G    1.12
A008K    1.09
Y009V    1.06
T010G    1.18
T010K    1.12
T010Q    1.01
I011Q    1.28
I011A    1.26
I011T    1.16
I011S    1.11
I011L    1.06
G012W    1.11
G012R    1.02
G013M    1.09
G013S    1.08
R014E    1.27
S015F    1.09
S015A    1.03
I019V    1.04
N024A    2.48
N024E    2.37
N024T    1.70
N024Q    1.70
N024V    1.62
N024M    1.48
N024H    1.45
N024L    1.34
N024F    1.21
N024S    1.10
I028L    1.16
A030S    1.11
R035F    1.20
R035D    1.01
T036I    14.08
T036G    2.46
T036N    2.13
T036S    2.08
T036W    1.84
T036P    1.69
T036H    1.67
T036D    1.61
T036Y    1.48
T036V    1.48
T036R    1.38
T036F    1.36
T036L    1.33
T036C    1.12
A038R    3.72
A038F    1.45
A038D    1.39
A038S    1.38
A038H    1.36
A038L    1.30
A038N    1.24
A038K    1.17
A038V    1.17
A038Y    1.14
A038I    1.11
A038I    1.11
A038G    1.09
A038T    1.00
T039A    1.01
T040V    1.21
T040S    1.09
A041N    1.13
A041I    1.02
N042H    1.18
N042K    1.01
T046K    1.01
F047I    1.17
F047M    1.13
F047V    1.01
G049F    1.32
G049K    1.16
G049A    1.16
G049L    1.12
G049W    1.08
G049H    1.07
G049T    1.06
G049S    1.01
S051A    1.47
S051Q    1.14
S051F    1.13
S051H    1.09
G054D    1.66
G054R    1.33
G054L    1.32
G054H    1.32
G054K    1.24
G054M    1.24
G054A    1.23
G054I    1.22
G054Q    1.21
G054N    1.06
G054E    1.03
N055F    1.54
N055Q    1.17
N055K    1.11
N055H    1.09
N055E    1.00
Y057M    1.00
R061M    1.20
R061S    1.08
R061T    1.02
T062I    1.22
G063V    1.21
G063W    1.12
G063Q    1.09
G063D    1.08
G063H    1.07
G063R    1.05
A064W    1.34
A064H    1.28
A064N    1.26
A064Y    1.26
A064R    1.22
A064F    1.21
A064K    1.19
A064M    1.19
A064S    1.18
A064L    1.18
A064I    1.16
A064Q    1.11
A064T    1.11
A064V    1.10
A064P    1.01
A064G    1.00
G065P    1.57
G065R    1.56
G065V    1.48
G065Y    1.46
G065S    1.40
G065T    1.38
G065Q    1.37
G065L    1.26
G065A    1.16
G065H    1.12
G065I    1.07
G065D    1.05
V066H    1.46
V066D    1.45
V066I    1.29
V066L    1.25
V066E    1.24
V066A    1.23
V066M    1.10
V066N    1.10
V066G    1.08
V066T    1.03
N067G    1.38
N067L    1.30
N067K    1.29
N067A    1.25
N067H    1.22
N067T    1.19
N067D    1.18
N067S    1.16
N067Q    1.14
N067R    1.13
N067Y    1.12
N067V    1.12
N067F    1.11
N067M    1.06
N067E    1.05
L068W    1.10
L068H    1.05
L068P    1.04
L069S    2.13
L069H    1.60
L069V    1.27
L069W    1.14
L069K    1.05
L069R    1.02
L069N    1.01
A070H    1.53
A070S    1.33
A070D    1.24
A070G    1.09
A070P    1.07
A070W    1.04
Q071I    1.46
Q071K    1.41
Q071G    1.40
Q071M    1.33
Q071H    1.28
Q071A    1.26
Q071N    1.26
Q071S    1.19
Q071D    1.16
Q071F    1.14
Q071L    1.11
Q071R    1.10
Q071T    1.06
V072I    1.17
N073T    2.73
N073S    1.28
N073H    1.12
N074G    1.87
Y075I    1.37
Y075G    1.36
Y075F    1.34
S076W    1.77
S076Y    1.69
S076V    1.51
S076L    1.44
S076N    1.20
S076T    1.18
S076I    1.18
S076E    1.17
S076R    1.14
S076A    1.13
S076Q    1.11
S076K    1.09
S076K    1.09
S076H    1.05
G077T    2.50
G077S    1.34
G077Y    1.21
G077N    1.18
G077Q    1.02
G077R    1.02
G078A    1.64
G078S    1.35
G078H    1.31
G078T    1.29
G078D    1.25
G078N    1.23
G078I    1.19
G078V    1.19
G078R    1.18
G078M    1.01
R079P    1.24
R079G    1.20
V080H    1.24
V080L    1.22
V080F    1.15
Q081V    1.33
Q081K    1.30
Q081H    1.24
Q081I    1.13
Q081D    1.11
Q081P    1.07
Q081E    1.03
Q081R    1.01
A083N    1.13
A083M    1.09
A083G    1.08
A083L    1.08
A083H    1.07
A083I    1.03
A083E    1.02
A083V    1.02
H085Q    1.41
H085T    1.26
H085R    1.22
H085L    1.22
H085K    1.15
H085M    1.01
T086A    1.21
T086G    1.08
T086N    1.08
T086I    1.08
T086L    1.08
T086E    1.03
T086K    1.03
T086H    1.02
A088K    1.05
P089N    1.19
P089V    1.05
P089Y    1.02
P089T    1.00
V090P    1.62
V090I    1.30
V090S    1.26
V090A    1.12
V090T    1.11
V090L    1.10
V090F    1.02
S092G    1.25
S092C    1.07
A093Q    1.08
A093T    1.07
A093H    1.01
S099T    1.02
G102Q    1.09
W103M    1.54
W103I    1.33
W103Y    1.01
H104K    1.22
H104R    1.04
T107S    1.17
T107V    1.14
T107M    1.12
T107H    1.12
T107R    1.07
T107K    1.03
T107N    1.01
T107Q    1.01
T109I    1.30
T109H    1.23
T109A    1.22
T109P    1.20
T109R    1.19
T109L    1.19
T109G    1.16
T109N    1.09
T109V    1.07
T109E    1.06
A110T    1.11
A110S    1.11
N112I    1.11
N112R    1.08
N112G    1.06
N112L    1.04
N112Q    1.03
N112H    1.00
S114G    1.37
T116F    1.45
T116R    1.06
T116H    1.04
T116G    1.01
P118A    1.45
P118F    1.39
P118R    1.37
P118H    1.24
P118I    1.19
P118Q    1.17
P118K    1.16
P118E    1.13
P118G    1.00
E119R    1.94
E119K    1.28
E119Q    1.04
E119G    1.02
E119L    1.00
R123E    1.20
R123I    1.11
R123K    1.05
R123D    1.03
I126L    1.20
R127F    1.20
T129S    1.20
E133Q    1.10
P134R    1.06
S140G    1.03
L142V    1.12
L142M    1.08
A143N    1.12
A143S    1.11
N145I    1.26
N145Q    1.25
N145E    1.24
N145G    1.16
N145T    1.14
N145L    1.11
N145S    1.07
N145F    1.04
N145R    1.04
N145P    1.00
Q146D    1.06
V150L    1.26
V150M    1.14
T151L    1.13
S155H    1.01
R159F    1.49
R159E    1.10
R159Y    1.07
R159K    1.04
R159N    1.01
G161K    1.08
T163I    1.13
F166Y    1.07
Q167N    1.16
Q167E    1.09
N170Y    2.76
N170D    1.15
N170L    1.12
N170A    1.11
N170C    1.05
N170R    1.03
N170P    1.01
P171T    1.02
Q174I    1.08
Q174L    1.02
A175V    1.04
A175T    1.02
A175H    1.02
G177M    1.42
G177S    1.09
G177R    1.04
R179V    1.63
R179M    1.36
R179D    1.33
R179I    1.31
R179N    1.29
R179Y    1.29
R179T    1.27
R179L    1.23
R179K    1.23
R179A    1.22
R179E    1.22
R179W    1.06
R179F    1.06
T182V    1.20
T182W    1.02
T182Q    1.01
T183I    1.35
T183K    1.19
T183M    1.14
T183R    1.09
T183L    1.07
T183Q    1.07
T183E    1.05
T183H    1.02
D184F    1.18
D184R    1.18
D184H    1.14
D184Q    1.10
D184T    1.03
D184I    1.03
D184V    1.01
S185I    1.15
S185V    1.11
S185W    1.09
S185N    1.07
S185K    1.06
S185P    1.03
S185L    1.02
P189Y    1.06
P189W    1.02
P189R    1.01
I181H    1.37
I181G    1.12
I181N    1.15
G186V    1.49
G186E    1.54
G186I    1.41
G186L    1.05
G186N    1.01
S187P    1.63
S187E    1.12
S187T    1.29
S187L    1.12
S188M    1.25
S188L    1.04

Thermostability Assays

The data in the following table (Table 30-3) represent the thermostability of ASP
variants relative to the stability of WT ASP under these conditions. The stability was
measured by determining casein activity before and after incubation at elevated temperature
(See, "Thermostability Assays" above). Each value in the table is the quotient of
(variant residual activity)/(WT residual activity). A value greater than one indicates
higher thermostability.
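The quotient described above can be sketched as follows. This is a minimal illustration, assuming that residual activity is the activity after incubation divided by the activity before incubation; the function names and the numbers in the usage lines are hypothetical and are not measurements from this Example.

```python
def residual_activity(activity_before, activity_after):
    """Fraction of the starting casein activity that survives incubation."""
    if activity_before <= 0:
        raise ValueError("activity before incubation must be positive")
    return activity_after / activity_before


def relative_thermostability(variant_before, variant_after, wt_before, wt_after):
    """Quotient of (variant residual activity) / (WT residual activity)."""
    wt_residual = residual_activity(wt_before, wt_after)
    if wt_residual <= 0:
        raise ValueError("WT residual activity must be positive")
    return residual_activity(variant_before, variant_after) / wt_residual


# Hypothetical activities (arbitrary units): the variant keeps half of its
# activity and WT keeps a quarter, so the variant scores 2.0 relative to WT.
print(relative_thermostability(1.0, 0.5, 1.0, 0.25))  # 2.0
```

Because each enzyme is normalized to its own starting activity, a variant can score above one in this table even if its absolute casein activity is lower than that of WT.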



Table 30-3. Thermostability Assay Results

Variant code    Thermostability relative

V003R    1.53
I004D    1.89
I004P    1.89
I004G    1.66
A008G    1.16
Y009E    2.04
Y009P    2.04
T010Y    1.64
T010F    1.53
T010W    1.49
T010L    1.26
T010C    1.21
T010E    1.10
T010D    1.09
T010M    1.06
T010V    1.06
T010S    1.03
G012D    1.86
G012A    1.15
G012H    1.14
G012V    1.06
G012I    1.06
G012S    1.00
R014H    1.08
R014I    1.08
R014K    1.08
R014N    1.08
R014Q    1.08
R014S    1.08
R014T    1.08
S015Q    1.23
S015R    1.23
S015C    1.22
S015T    1.16
S015N    1.16
S015H    1.13
S015F    1.07
S015A    1.04
S015M    1.04
S015I    1.03
R016K    1.07
R016I    1.06
S018E    2.18
A022C    2.27
A022S    1.94
A022T    1.55
N024T    1.49
N024S    1.25
N024E    1.12
N024G    1.12
N024Q    1.04
N024K    1.04
N024A    1.01
N024V    1.01
G025S    1.25
G026I    2.50
G026K    2.50
G026L    2.50
G026Q    2.50
G026V    2.50
G026W    2.50
G026E    2.11
F027V    2.50
F027W    2.50
F027I    1.36
I028P    2.50
I028W    1.99
I028T    1.78
T029E    2.50
A030M    2.13
A030N    2.13
A030P    1.75
A030Y    1.57
G031M    2.13
G031H    1.65
G031V    1.63
G031N    1.55
G031A    1.15
H032A    1.37
H032C    1.01
H032R    1.01
C033M    2.13
C033L    2.04
C033N    1.85
C033E    1.85
C033D    1.36
C033T    1.01
C033K    1.01
R035H    1.08
R035Q    1.08
R035V    1.08
R035W    1.08
R035H    1.08
R035T    1.08
R035Y    1.05
T036V    1.13
T036I    1.09
T036K    1.08
T036P    1.08
A038D    1.60
A038C    1.43
A038Y    1.07
T039R    1.72
T039V    1.19
T039Q    1.11
T039K    1.07
T039W    1.07
T039L    1.03
T039P    1.03
T040D    2.33
T040Q    2.33
T040H    2.24
T040P    1.73
T040N    1.55
T040G    1.07
A041S    1.31
A041D    1.07
P043D    2.33
P043H    2.33
P043K    2.33
P043L    2.33
P043N    2.12
P043G    1.53
T044V    1.03
G045V    2.06
G045A    1.82
T046Y    1.68
T046V    1.66
T046W    1.43
T046F    1.32
T046Q    1.01
A048P    1.96
A048V    1.05
A048E    1.04
G049A    1.22
S051V    1.32
S051C    0.99
P053N    1.00
G054E    1.00
Y057N    1.65
Y057M    1.55
F059K    2.17
F059W    1.33
F059C    1.07
T062R    1.92
T062G    1.44
A070P    1.89
A070G    1.43
Q071Y    1.35
Q071A    1.21
Q071F    1.06
N073P    2.08
N074F    1.36
S076A    1.00
R079T    1.58
R079V    1.31
R079M    1.01
Q081A    1.92
Q081S    1.65
Q081P    1.57
Q081G    1.54
Q081H    1.52
Q081D    1.51
Q081F    1.43
Q081E    1.39
Q081C    1.13
Q081T    1.08
A083H    1.62
A083M    1.35
A083E    1.23
A083F    1.20
A083R    1.14
A083S    1.00
G084C    2.08
G084P    2.08
G084V    1.17
G084M    1.17
T086S    1.39
T086I    1.20
T086M    1.12
T086A    1.11
T086H    1.08
T086D    1.06
T086N    1.05
T086V    1.04
A087S    1.20
A087E    1.12
P089W    2.22
P089A    1.27
V090A    1.35
V090M    1.18
V090I    1.11
V090T    1.03
G091L    2.22
G091K    1.06
S092T    1.14
A093S    1.66
A093D    1.19
A093Q    1.06
A093Q    1.06
A093N    1.06
A093G    1.02
R096C 


1.92 


R096F 


1.75 


R096E 


1.57 


S099A 


1.80 


S099G 


1.17 


T100A 


1.70 


T100D 


1.18 


T100Q 


1.16 


T100E 


1.08 


T101S 


1.14 


W103N 


1.20 


C105E 


1.89 


C105G 


1.89 


C105K 


1.89 


C105M 


1.89 


C105N 


1.89 


C105S 


1.89 


C105P 


1.72 


C105W 


1.69 


C105T 


1.28 


C105Y 


1.22 


C105A 


1.21 


C105L 


1.18 


T107S 


1.30


T107L 


1.24 


T107Q 


1.24 


T107A


1.17 


T107F 


1.14 


T107R 


1.11 


T107K 


1.10 


T107H 


1.02 


T107M 


1.00 


A110G 


1.15 


L111K 


1.17 


L111R 


1.10 


N112D 


1.08 


N112E 


1.08 


N112G 


1.08 


N112H 


1.08 


N112Q 


1.08 


N112R 


1.07 


N112L 


1.03 


N112P 


1.03 


N112F 


1.01 


S113M 


1.08 


S113N 


1.08 


S113R 


1.08 


S113T 


1.08 


S113C 


1.04 


S113H 


1.01 
S113F


1.00 


S113I 


0.99 


V115I 


1.18 


V115L 


1.14 


V115T 


1.05 


T116Q 


1.13 


T116E 


1.09 


T116L 


1.03 


Y117K 


1.41 


Y117Q 


1.41 


Y117R 


1.41 


Y117V 


1.41 


P118T 


1.12 


P118R 


1.08 


P118Q 


1.03 


P118S 


1.02 


E119L 


1.24 


E119V 


1.03 


T121E 


1.54 


T121D 


1.23 


T121A 


1.15 


T121S 


1.05 


T121H 


1.03 


V122C 


1.02 


R123W 


1.73 


R123F 


1.67 


R123Y 


1.58 


R123N 


1.53 


R123L 


1.39 


R123I 


1.39 


R123T 


1.35 


R123Q 


1.20 


R123K 


1.18 


R123V 


1.11 


L125A 


1.45 


L125M 


1.38 


R127K 


1.41 


R127Q 


1.41 


R127F 


1.21 


R127Y 


1.09 


R127D 


1.03 


R127E 


1.03 


T128A 


1.89 


T128V 


1.89 


T128G 


1.88 


T128S 


1.48 


T128C 


1.47 


T129W 


2.50 


T129Y 


1.30 


V130T 


1.13 
V130C


1.07 


A132C 


1.19 


P134W 


1.18 


S137R 


1.92 


S140P 


1.88 


L141C 


1.33


L142M


1.10


A143H 


1.19 


G144A 


1.14 


G144V 


1.10 


G144D 


1.02 


G144I 


1.00 


G144E 


0.99 


Q146P 


1.53 


Q146Y 


1.02 


A147E 


2.00 


A147C 


1.08 


V150N 


1.12 


T151C 


1.30 


T151A 


1.07 


G153K 


1.23 


G153V 


1.23 


G154L 


1.17 


G154R 


1.14 


G154E 


1.13 


S155P 


1.92 


S155R 


1.92 


S155W 


1.78 


S155K 


1.69 


S155Y 


1.66 


S155F 


1.48 


S155T 


1.18 


S155V 


0.99 


G156I 


1.92 


G156L 


1.92 


G156P 


1.81 


G156V 


1.08 


G156E 


1.03 


C158H 


2.00 


C158G 


1.57 


C158M 


1.49 


R159K 


1.56 


R159T 


1.26 


R159V 


1.15 


R159Q 


1.14 


T160I 


1.48 


T160E 


1.27 


T160Q 


1.14 


T160L 


1.09 


T160D 


1.04 



T160R 


1.04 


G161L 


2.13 


G161V 


2.13


G161I 


1.50 


G161K 


1.24 


G162P 


1.32 


G162L 


1.11 


T163I 


1.19


T163V 


1.02 


T164G 


1.83 


T164L 


1.54 


F165T 


1.01 


F165D 


0.99 


F166S 


1.44 


F166C 


1.29 


F166A 


1.20 


F166G 


1.01 


Q167L 


1.79 


Q167N 


1.08 


P168Y 


1.45 


P168I 


1.17 


N170E 


1.32 


N170D 


1.17 


N170L 


1.06 


N170V 


0.99 


Q174H 


1.11 


Q174L 


1.06 


Q174R 


1.06 


Q174V 


1.03 


Y176P 


1.48 


Y176K 


1.06 


Y176D 


1.03 


G177N 


1.18 


G177K 


1.03 


R179K 


1.21 


M180L 


1.30 


T182L 


1.14 


T182V 


1.01 


T183P 


1.26 


T183I 


1.17 


T183A 


1.13 


T183S 


1.11 


T183V 


1.06 


D184E 


1.04 


S185R 


1.32 


S185Q 


1.08 


G186S 


1.65 


G186P 


1.23 


S187R 


1.02 


S187G 


1.00 
S188A


1.44 


S188E 


1.42 


S188V 


1.42 


S188T 


1.36 


S188M 


1.26 



S188G 


1.23 


S188C 


1.16 


S188H 


1.01 


P189S 


1.16 


P189S 


1.16 



P189D 


1.04 


P189K 


1.04 


P189Y 


1.03 


P189P 


0.99 



BMI-LVJ 1 Performance Assays 

The following table (Table 30-4) provides the data obtained for selected variants in
the BMI-LVJ 1 performance assay (See, "Microswatch Assay for Testing Protease
Performance"). The table shows performance indices, which were calculated as described
above, for the variants that show improved performance compared to the WT enzyme.
Variants that have a performance index greater than 1 have improved performance.



Table 30-4. BMI-LVJ 1 Assay Results



Variant 
code 


BMI US 
LVJ-1 
liquid 
detergent 
[perf.
Index] 


F001T 


1.06 


D002Q 


1.14 


D002E 


1.05 


D002P 


1.01 


V003L 


1.24 


V003I 


1.12 


N007L 


1.14 


A008G 


1.09 


A008D 


1.07 


A008E 


1.04 


A008M 


1.03 


A008K 


1.01 


T010E 


1.10 


T010Q 


1.08 


T010L 


1.08 


T010D 


1.02 


T010G 


1.01 


I011Q


1.18


I011A


1.13


I011T


1.12


I011S


1.11


I011L


1.06 






G012D 


1.08 


G012Y 


1.07 


G012N 


1.05 


G012L 


1.03 


G012Q 


1.00 


R014I 


1.25 


R014M 


1.20 


R014L 


1.11


R014E


1.09 


R014N 


1.08 


R014P 


1.08 


R014G 


1.03 


R014Q 


1.03 


S015E 


1.09


S015G 


1.04 


R016Q 


1.14 


R016L 


1.14


R016N 


1.11 


R016G 


1.11 


R016I 


1.09 


R016A 


1.09 


R016M 


1.03 


I019V


1.02 


N024E 


1.36 


N024A 


1.31 


N024T 


1.21 


N024Q 


1.19 


N024V 


1.17 






N024H 


1.14 


N024M 


1.13 


N024L 


1.12 


N024S 


1.07 


N024W 


1.00 


R035F 


1.23 


R035L 


1.14 


R035A 


1.06 


R035D 


1.03 


R035H 


1.03 


T036I 


9.16 


T036N 


1.77


T036G 


1.64 


T036S 


1.61 


T036P 


1.49 


T036D 


1.41 


T036H 


1.25 


T036Y 


1.25 


T036L 


1.18 


T036W 


1.15 


T036F


1.05 


A038R 


1.55 


A038L 


1.18 


A038S 


1.16 


A038Y 


1.12 


A038N 


1.10 


A038D 


1.09 


A038F 


1.08 
A038V


1.06 


T040V 


1.10 


T040S 


1.02 


A041N 


1.09 


A041I 


1.04 


N042H 


1.11 


T046Q 


1.05 


F047V 


1.02 


A048Q 


1.26 


G049A 


1.15 


G049F 


1.11 


G049H 


1.09 


G049S 


1.07 


G049T 


1.02 


G049V 


1.01 


S050F 


1.04 


S051Q 


1.10 


S051T 


1.07 


S051D 


1.05 


S051A 


1.05 


S051V 


1.03 


S051M 


1.01 


S051H 


1.01 


G054D 


1.48 


G054Q 


1.17 


G054E 


1.16 


G054N 


1.14 


G054I 


1.14 


G054L 


1.11 


G054M 


1.09 


G054A 


1.08 


G054H 


1.00 


N055F 


1.07 


N055E 


1.01 


Y057M 


1.00 


R061V 


1.13 


R061K 


1.12 


R061M 


1.11 


R061H 


1.08 


R061S 


1.06


R061T


1.06


T062I 


1.02 


A064H 


1.15 


A064F 


1.14 


A064Y 


1.13 


A064W 


1.10 


A064N 


1.10 


A064T 


1.09 


A064S 


1.08 


A064V 


1.06 
A064Q


1.04


A064I


1.04 


A064L 


1.02 


A064G 


1.02 


G065P 


1.28 


G065Q 


1.27 


G065T 


1.19 


G065Y 


1.19 


G065S 


1.17 


G065L 


1.13 


G065V 


1.11 


G065R 


1.10 


V066H 


1.15 


V066D 


1.07 


V066E 


1.02 


N067L 


1.25 


N067S


1.23 


N067A 


1.19 


N067Y 


1.16 


N067G 


1.14 


N067V 


1.14 


N067Q 


1.13 


N067T 


1.10 


N067F 


1.08 


N067M 


1.07 


N067K 


1.06 


N067D 


1.02 


N067H 


1.02 


N067C 


1.01 


L068H 


1.02 


L069S 


1.29 


L069H 


1.14 


L069W 


1.06 


L069V 


1.02 


A070G 


1.12 


A070P 


1.09 


A070D 


1.01


Q071I 


1.14 


Q071H 


1.13 


Q071F 


1.12 


Q071D 


1.11 


Q071L 


1.09 


Q071T 


1.06 


Q071Y 


1.04 


Q071S 


1.04 


Q071A 


1.03 


V072I 


1.04 


N073T 


1.67 


N074G 


1.28 


Y075G 


1.37 



Y075F 


1.30 


Y075I 


1.18 


S076W 


1.49 


S076L 


1.39 


S076Y 


1.37 


S076T 


1.30 


S076V 


1.30 


S076I 


1.25 


S076D 


1.22 


S076N 


1.20 


S076A 


1.16 


S076E 


1.14 


G077T 


1.48 


G077S


1.11 


G077N 


1.07 


G078D 


1.24 


G078A 


1.12 


G078N 


1.10 


G078H 


1.04 


G078S


1.02 


R079P 


1.20 


R079G 


1.13 


R079D 


1.12 


R079C 


1.02


V080L


1.11


V080H


1.09


V080Q


1.04


Q081P 


1.22 


Q081V 


1.02 


Q081K 


1.01 


Q081H 


1.01 


A083N 


1.03 


A083E 


1.03 


H085Q 


1.42 


H085L 


1.30


H085R 


1.23 


H085K 


1.19 


H085F 


1.13 


H085Y 


1.11 


H085T 


1.10 


H085M 


1.05 


H085V 


1.02 


T086N 


1.07 


T086D 


1.05 


T086R 


1.01 


T086Q 


1.00 


T086I 


1.00 


T086V 


1.00 


A088F 


1.12 


A088H 


1.04 
P089D


2.85


P089N


1.11


P089V


1.07


P089T


1.00


V090I


1.12


V090P


1.10


V090T


1.06


V090L


1.01


[illegible]


1.11


A093T


2.26


A093[illegible]


2.15


A093D


1.10


A093F


1.06


A093Q


1.05


A093H


1.00


R096K


1.02


S099W


1.50


S099N


1.38


S099A


1.22


S099[illegible]


1.15


[illegible]


1.14


[illegible]


1.09


S099F


1.03


[illegible]


1.01


[illegible]


1.01


[illegible]


1.01


[illegible]


1.05


T107H


1.02


T107N


1.01 


T107S 


1.01 


T109E 


1.15 


T109N 


1.03 


T109I 


1.02 


L111E


1.10 


L111D 


1.07 


L111T 


1.01 


N112E 


1.14 


N112L 


1.11 


N112Q 


1.09 


N112D 


1.08 


N112G 


1.07 


N112A 


1.01 


N112H 


1.01 


S113A 


1.13 


S113G 


1.12 


S113M 


1.04 


S114G 


1.25 


S114A 


1.05 


V115T 


1.03 
T116F


1.15 


T116E 


1.02 


P118E 


1.02 


T121D 


1.13 


T121E 


1.11 


T121S 


1.04 


R123I 


1.24 


R123F


1.22 


R123L 


1.15 


R123Q 


1.12 


R123E 


1.10 


R123K 


1.07 


R123P 


1.03 


R123W 


1.02 


R123H 


1.01 


G124A 


1.01 


L125V 


1.06 


R127A 


1.44 


R127S 


1.36 


R127Q 


1.36 


R127K 


1.28 


R127L 


1.25 


R127H 


1.25 


R127Y 


1.23 


R127F 


1.22 


R127T 


1.19 


R127G 


1.16 


R127V 


1.01 


T129S 


1.03 


A132V 


1.22 


A132C 


1.03 


P134E 


1.14 


P134G 


1.06 


P134A 


1.01 


S140A 


1.07 


L142V 


1.09 


L142M 


1.02 


A143N 


1.21 


A143S 


1.05 


A143H 


1.01 


N145D 


1.06 


N145S 


1.02 


V150L 


1.07 


N157D 


1.17 


R159F 


1.63 


R159E 


1.43 


R159K 


1.29 


R159H 


1.28 


R159N 


1.22 


R159Y 


1.17 



R159D 


1.17 


R159V 


1.12 


R159C 


1.11


R159L 


1.10 


R159A 


1.06 


R159W 


1.02 


T160E 


1.12 


T160D 


1.02 


G161K 


1.15 


G161E 


1.10 


T163D 


1.13 


T163I 


1.06 


N170Y 


1.34 


N170D 


1.09 


N170L 


1.08 


N170G 


1.03 


N170A 


1.00 


P171S 


1.03 


P171V 


1.01 


A175V 


1.05 


G177M 


1.03 


R179V 


1.19 


R179T 


1.11 


R179K 


1.10 


R179N 


1.09 


R179D 


1.07 


R179E 


1.06 


R179A 


1.03 


R179I 


1.01 


R179M 


1.01 


R179F 


1.00 


I181Q 


1.24 


I181H 


1.07 


I181T 


1.00 


T182V 


1.05 


T182L 


1.00 


T183I 


1.10 


T183V 


1.03 


T183S 


1.01 


D184F 


1.34 


D184H 


1.15 


D184R 


1.11 


D184T 


1.08 


D184I 


1.07 


D184Q 


1.06 


D184L 


1.05 


S185I 


1.09 


S185W 


1.08 


S185L 


1.05 


S185L 


1.05 
S185M


1.04 


S185G 


1.01 


G186V 


1.27 


G186E 


1.20 


G186I 


1.17 


G186L 


1.06 


G186T 


1.03 



G186Q 


1.01 


G186R 


1.00 


S187P 


1.24 


S187Q 


1.13 


S187E 


1.08 


S187T 


1.04 


S188Q 


1.06 



S188M 


1.03 


S188L 


1.01 


P189T 


1.05 


P189N 


1.02 


P189I 


1.01
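
The variant codes and performance indices tabulated above lend themselves to simple programmatic handling. The following is a minimal sketch, not part of the original disclosure, that assumes each code follows the pattern seen in the table (wild-type residue, three-digit position, substituted residue) and that "improved" means a performance index strictly greater than 1. The names `Variant`, `parse_variant` and `improved` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    wild_type: str     # one-letter residue present in the WT enzyme
    position: int      # residue position
    substitution: str  # one-letter replacement residue


# Assumed code pattern, e.g. "T036I": letter, three digits, letter.
_CODE = re.compile(r"([A-Z])(\d{3})([A-Z])")


def parse_variant(code: str) -> Variant:
    """Split a variant code such as 'T036I' into its three parts."""
    match = _CODE.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a variant code: {code!r}")
    wild_type, position, substitution = match.groups()
    return Variant(wild_type, int(position), substitution)


def improved(rows, threshold=1.0):
    """Return the (code, index) pairs whose index exceeds the threshold."""
    return [(code, index) for code, index in rows if index > threshold]


# A few rows copied from Table 30-4.
rows = [("T036I", 9.16), ("T036N", 1.77), ("G012Q", 1.00), ("N024W", 1.00)]

for code, index in improved(rows):
    print(parse_variant(code), index)
```

A strict pattern is used on purpose: a code that has been damaged in transcription raises an error instead of being silently accepted.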



BMI-Low pH Performance Assays 

The table below (Table 30-5) provides the data obtained for the ASP variants which 
show activity on this substrate in the microswatch assays under low pH conditions (See, 
Microswatch Assay for Testing Protease Performance) using TIDE®. The table provides 
performance indices, which were calculated as described above for the variants which show 
improved performance compared to WT. Variants that have a performance index greater 
than 1 have improved performance. 



Table 30-5. BMI-Low pH Performance Assays 



Variant 
code 


BMI US 
LVJ-1 
liquid 

detergent 
[perf. 
Index] 


F001T 


1.06 


V003L 


1.11 


V003I 


1.03 


I004M 


1.11 


N007L 


1.08 


A008R 


1.53 


A008V 


1.46 


A008T 


1.44 


A008S 


1.25 


A008E 


1.20 


A008L 


1.20 


A008N 


1.19 


A008H 


1.15 


A008P 


1.13 


A008D 


1.08 


A008Q 


1.07 


T010Q 


1.04 


T010L 


1.04 


T010D 


1.01 


I011T


1.14


I011S


1.05 


G012D 


1.00 


R014L 


1.32 


R014M 


1.25 
1.16


R014Q 


1.16 


R014N 


1.07 


R014K 


1.05 


R014D 


1.01 


S015E 


1.05 


R016Q 


1.22 


R016L 


1.08 


R016I 


1.07 


R016W 


1.05 


R016N 


1.02 


I019V


1.04 


N024E 


1.61 


N024A 


1.52 


N024T 


1.35 


N024Q 


1.25 


N024L 


1.21 


N024M 


1.15 


N024V 


1.15 


N024H 


1.14 


N024F 


1.06 


N024S 


1.03 


R035F 


1.36 


R035L 


1.21 


R035A 


1.14 


R035E 


1.13 


R035D 


1.08 


R035H 


1.07 


T036I 


9.02 



T036N 


1.69 


T036G 


1.63 


T036S 


1.59 


T036P 


1.41 


T036D 


1.28 


T036V 


1.19 


T036W 


1.07 


T036H 


1.06 


T036L 


1.02 


T036F 


1.02 


A038R 


1.89 


A038F 


1.41 


A038S 


1.32 


A038L 


1.26 


A038D 


1.25 


A038H 


1.20 


A038N 


1.13 


A038I 


1.10 


A038Y 


1.08 


A038V 


1.02 


A038T 


1.00 


T040V 


1.14 


T040S 


1.01 


A041N 


1.10 


A041I 


1.04 


F047I 


1.01


A048E 


1.04 


G049L 


1.16 


G049A 


1.10 


G049F 


1.06 
G049N


1.06 


G049T 


1.04 


G049S 


1.04 


S051A 


1.35 


S051D 


1.25 


S051Q 


1.12 


S051F 


1.09 


S051T 


1.08 


S051H 


1.06 


G054D 


1.67 


G054I 


1.22 


G054L 


1.21 


G054E 


1.20 


G054Q 


1.16 


G054A 


1.16 


G054M 


1.10 


G054N 


1.06 


G054H 


1.01 


G054K 


1.01 


N055F 


1.69 


N055E 


1.35 


N055S 


1.25 


N055Q 


1.15 


N055V 


1.09 


N055T 


1.02 


F059W 


1.01 


R061M 


1.35 


R061T 


1.22 


R061V 


1.15 


R061S 


1.07 


R061N 


1.02 


R061K 


1.02 


R061Q 


1.02 


T062I 


1.14 


G063V 


1.25 


G063D 


1.18 


G063P 


1.13 


G063Q 


1.12 


A064N 


1.28 


A064H 


1.24 


A064S 


1.23 


A064Q 


1.21 


A064R 


1.19 


A064M 


1.15 


A064T 


1.15 


A064I 


1.14 


A064W 


1.14 


A064F 


1.11 


A064L 


1.11 


A064V 


1.09 
A064K


1.06 


A064Y 


1.05 


G065P 


1.66 


G065Q 


1.49 


G065S 


1.35 


G065Y 


1.32 


G065T 


1.26 


G065R 


1.22 


G065D 


1.16 


G065A 


1.12 


G065L 


1.05 


G065V 


1.04 


V066D 


1.21 


V066E 


1.08 


N067G 


1.41 


N067V 


1.39 


N067L 


1.32 


N067T 


1.31 


N067D 


1.25 


N067M 


1.25 


N067F 


1.24 


N067S 


1.24 


N067Y 


1.23 


N067C 


1.20 


N067A 


1.18 


N067Q 


1.13 


N067R 


1.11 


N067K 


1.07 


N067E 


1.07 


N067H 


1.06 


L068T 


1.03 


L068H 


1.01 


L069S 


1.79 


L069H 


1.64 


L069W 


1.26 


L069V 


1.21 


L069Q 


1.12 


A070S 


1.18 


A070P 


1.12 


A070G 


1.10 


Q071M 


1.15 


Q071D 


1.10 


Q071S 


1.03 


N073T 


1.77 


N074G 


1.61 


Y075G 


1.58 


Y075F 


1.40 


S076V 


1.71 


S076Y 


1.71 


S076I 


1.55 



S076D 


1.55 


S076L 


1.46 


S076W 


1.42 


S076N 


1.40 


S076E 


1.25 


S076C 


1.22 


S076T 


1.18 


S076Q 


1.17 


S076A 


1.11 


S076K 


1.07 


S076H 


1.00 


G077T 


1.86 


G077Q 


1.13 


G077N 


1.10 


G077S 


1.03 


G078D 


1.23 


R079P 


1.89 


R079C 


1.34 


R079G 


1.32 


R079E 


1.29 


R079D 


1.28 


R079L 


1.12 


R079A 


1.02 


Q081V 


1.31 


Q081I 


1.11 


Q081E 


1.10 


Q081H 


1.10 


Q081L 


1.07 


Q081K 


1.06 


Q081D 


1.06 


Q081A 


1.01 


A083N 


1.27 


A083I 


1.16 


A083D 


1.12 


A083M 


1.07 


A083L 


1.04 


A083E 


1.02 


A083G 


1.00 


H085Q 


1.24 


H085L 


1.19 


H085R 


1.12 


H085N 


1.08 


H085T 


1.08 


H085F 


1.05 


H085K 


1.04 


T086A 


1.27 


T086I 


1.24 


T086L 


1.22 


T086F 


1.21 


T086E 


1.15 
T086M


1.11


T086D 


1.07 


T086C 


1.04 


T086Q 


1.04


T086G


1.03


A088F 


1.15 


P089V 


1.07 


P089T 


1.07


V090P 


1.50 


V090S 


1.38 


V090I 


1.32 


V090T 


1.23 


V090N 


1.22 


V090L 


1.16 


V090A 


1.07 


S092C 


1.20 


S092G 


1.12 


S092A 


1.02 


A093D 


1.31 


A093E 


1.30 


A093Q 


1.09 


R096K 


1.15 


T101S 


1.16 


W103M 


1.18 


W103Y 


1.18 


H104M 


1.01 


H104K 


1.00 


T107N 


1.42 


T107S 


1.30 


T107M 


1.24 


T107A 


1.20 


T107E 


1.20 


T107Q 


1.15 


T107H 


1.11 


T107V 


1.10 


T109E 


1.46 


T109I 


1.31 


T109A 


1.28 


T109G 


1.21 


T109H 


1.13 


T109N 


1.13 


T109L 


1.10 


T109F 


1.01


A110S 


1.19 


A110T 


1.13 


A110N 


1.05 


N112E 


1.24 


N112D 


1.20 


N112Q 


1.07 


N112A 


1.05 



N112L 


1.04 


S113A 


1.07 


S113G 


1.04 


S113M 


1.03 


S113E 


1.00 


S114G 


1.20 


S114T 


1.05 


S114A 


1.03


T116F 


1.12 


T116G 


1.06 


T116E 


1.06 


T116Q 


1.00 


P118E 


1.00 


T121E 


1.46 


T121D 


1.31 


T121L 


1.12 


T121G 


1.06 


R123E 


1.42 


R123D 


1.35 


R123I 


1.34 


R123F 


1.29 


R123L 


1.20 


R123P 


1.18 


R123Q 


1.14 


R123A 


1.12 


R123H 


1.12 


R123K 


1.11 


R123N 


1.01 


G124N 


1.04 


G124T 


1.00 


L125V 


1.17 


I126L 


1.26 


R127A 


1.38 


R127S 


1.31 


R127Q 


1.26 


R127L 


1.26 


R127K 


1.26 


R127H 


1.25 


R127Y 


1.21 


R127T 


1.19 


R127F 


1.18 


R127G 


1.06 


R127V 


1.04 


T129S 


1.20 


T129G 


1.14 


A132V 


1.19 


A132S 


1.08 


E133D 


1.06 


P134A 


1.25 


P134E 


1.23 



P134D 


1.15 


P134G 


1.09 


S140A 


1.15 


L142V 


1.28 


L142M 


1.02 


A143N 


1.25 


A143M 


1.03 


A143S 


1.03 


N145S 


1.36 


N145E 


1.32


N145Q 


1.15 


N145G 


1.13


N145P 


1.12 


N145T 


1.09 


N145L 


1.06 


N145F 


1.01 


Q146D 


1.12


Q146F


1.02


T151V 


1.18 


N157D 


1.04 


R159F 


1.67 


R159E 


1.60 


R159C 


1.50 


R159Y 


1.31 


R159D 


1.30 


R159K 


1.25 


R159Q 


1.22 


R159N 


1.20 


R159H 


1.17 


R159A 


1.16


R159L 


1.09 


R159V 


1.08 


R159W 


1.06 


R159P 


1.06 


R159M 


1.03 


T160E 


1.08 


G161E 


1.33 


G161K 


1.11 


G161Q 


1.05 


T163D 


1.25 


T163I 


1.00 


T163C


1.00 


F166Y 


1.12 


P168S 


1.06 


N170Y 


2.54 


N170D 


1.20 


N170C 


1.19 


N170L 


1.06 


N170L 


1.06 


N170P 


1.02 
N170H


1.00 


P171M 


1.18 


P171V 


1.03 


I172V


1.28 


A175T 


1.13 


A175V 


1.12 


A175F 


1.02 


Y176L 


1.08 


G177M 


1.62 


G177S 


1.08 


G177Q 


1.08 


R179V 


1.70 


R179M 


1.44 


R179I 


1.39 


R179Y 


1.37 


R179N 


1.35 


R179T 


1.30 


R179L 


1.30 


R179K 


1.30 


R179A 


1.30 


R179D 


1.27 



R179E 


1.22 


R179W 


1.20


R179G 


1.08 


R179F 


1.06 


M180D 


1.31 


I181Q 


1.07 


I181C 


1.01 


I181L 


1.00 


I181T 


1.00 


T182V 


1.23 


T182W 


1.11 


T182L 


1.07 


T182Q 


1.06 


T182P 


1.05 


T183I 


1.25 


T183E 


1.16 


T183Q 


1.14 


T183K 


1.10 


T183L 


1.10 


T183A 


1.05 


T183D 


1.05 



T183V 


1.05 


T183R 


1.04 


T183M 


1.03 


D184F 


1.00 


G186E 


1.42 


G186V 


1.34 


G186I 


1.21 


G186L 


1.11 


G186P 


1.09 


G186T 


1.09 


G186A 


1.03 


S187P 


1.39 


S187T 


1.18 


S187E 


1.11 


S187L 


1.07 


S187Q 


1.04 


S187V 


1.02 


S188E 


1.09 


S188P 


1.04 



Scrambled Egg Assay (ADW) Performance 

The following table (Table 30-6) provides the data obtained for selected variants in
the scrambled egg performance assay (See, "Scrambled Egg Assay") using Detergent
Composition I. The table shows performance indices, which were calculated as described
above, for the variants that show improved performance compared to the WT enzyme.
Variants that have a performance index greater than 1 have improved performance.



Table 30-6. Scrambled Egg Assay Performance Results 



Variant 
code 


ADW 
[perf. 
Index] 


F001T 


1.00 


D002A 


1.06 


D002N 


1.05 


T010G 


1.36 


T010A 


1.25 


T010L 


1.14 


T010F 


1.03 


T010M 


1.03 


T010V 


1.03 


T010Q 


1.02 



T010S 


1.01 


I011A


1.20


I011S


1.20 


I011T 


1.18 


I011L


1.02 


G012I 


1.12 


G012Y 


1.08 


G012R 


1.05 


G012Q 


1.04 


R014M 


1.26 


R014G 


1.14 


R014A 


1.10 


S015G 


1.14 



S015F 


1.14 


S015E 


1.13 


S015H 


1.08 


R016K 


1.15 


R016N 


1.12 


R016A 


1.10 


R016H 


1.03 


I019V


1.02 


A022V 


1.23 


N024E 


1.67 


N024T 


1.46 


N024Q 


1.31 


N024A 


1.28 
N024L


1.15 


N024V 


1.12 


N024H 


1.03 


G034A 


1.28 


T036I 


6.72 


T036S 


1.32 


T036G 


1.30 


T036N 


1.18 


T036V 


1.11 


T036W 


1.06 


T036Y 


1.05 


T036D 


1.02 


T036P 


1.02 


T036F 


1.01 


A038T 


1.27 


A038F 


1.24 


A038M 


1.00 


T046K 


1.10 


F047I 


1.05 


F047V 


1.02 


G049F 


1.17 


G049A 


1.13 


G049L 


1.10 


G049H 


1.05 


G049S 


1.02 


G049V 


1.01 


S051A 


1.13 


Y057M 


1.05 


G063V 


1.08 


G063W 


1.01 


G063D 


1.01 


A064H 


1.11 


A064R 


1.08 


A064Y 


1.07 


A064W 


1.07 


A064V 


1.06 


A064T 


1.05 


A064N 


1.05 


A064K 


1.04 


A064Q 


1.04 


A064L 


1.04 


A064I 


1.02 


G065V 


1.17 


G065T 


1.11 


G065S 


1.10 


G065L 


1.10 


G065A 


1.07 


G065P 


1.04 


G065D 


1.03 


L069S 


1.39 



L069H 


1.19 


L069V 


1.06 


A070S 


1.09 


Q071I 


1.15 


Q071F 


1.09 


Q071M 


1.05 


Q071H 


1.03 


Q071D 


1.02 


Q071L 


1.01 


N073T 


1.89 


N074G 


1.12 


Y075F 


1.10 


Y075G 


1.07 


S076W 


1.26 


S076V 


1.22 


S076Y 


1.21


S076D 


1.13 


S076L 


1.12 


S076E 


1.09 


S076R 


1.09 


S076N 


1.08 


S076A 


1.07 


S076Q 


1.05 


S076H 


1.05 


S076T 


1.05


S076I 


1.04 


S076K 


1.04 


G077T 


1.87 


G078T 


1.08 


G078A 


1.06 


G078S 


1.05 


G078R 


1.03 


G078D 


1.00 


R079L 


1.07


R079G 


1.07 


R079S 


1.05 


R079T 


1.04 


R079V 


1.03 


R079D 


1.01 


R079A 


1.01 


V080A 


1.13 


V080L 


1.11 


H085T 


1.05 


T086Q 


1.03 


T086A 


1.02 


A088F 


1.04 


P089A 


1.03 


V090I 


1.17 


V090P 


1.13 


A093S 


1.04 



A093Q 


1.02 


S099N 


1.14 


S099V 


1.12 


S099Q 


1.05 


S099I 


1.01 


T107R 


1.13 


T107K 


1.12 


T107S 


1.10 


T107H 


1.09 


T107F 


1.09 


T107I 


1.07 


T107M 


1.07 


T107V 


1.06 


T107A 


1.06 


T107L 


1.04 


T107W 


1.02 


T109R 


1.07 


T109I 


1.06 


T109V 


1.02 


A110S 


1.01 


N112S 


1.31 


S114A 


1.11 


S114T 


1.09 


V115A 


1.04 


T116A 


1.10 


T116S 


1.03 


P118F 


1.06 


P118R 


1.05 


E119R 


1.27 


T121L 


1.05 


T121S 


1.03 


T121Q 


1.02 


G124T 


1.03 


L125Q 


1.02 


R127F 


1.15 


T128S 


1.10 


T129S 


1.11 


P134R 


1.89 


P134E 


1.49 


P134L 


1.48 


P134H 


1.29 


P134V 


1.23 


P134D 


1.13 


P134T 


1.11 


P134S 


1.08 


S140A 


1.33 


L142V 


1.24 


A143S 


1.06 


N145D 


1.01 


V150L 


1.12 
T151L


1.07 


G154S 


1.01 


R159F 


1.39 


R159K 


1.15 


R159Y 


1.06 


R159Q 


1.00 


G161K 


1.11 


T163I 


1.15 


T164G 


1.11 


F166Y 


1.13 


F166V 


1.07 


Q167N 


1.13 



N170Y 


1.24 


N170D 


1.03 


N170G 


1.02 


L178V 


1.08 


R179V 


1.16 


R179K 


1.10 


R179T 


1.05 


T182V 


1.22 


T182L 


1.04 


T183I 


1.08 


T183S 


1.02 


D184T 


1.05 



D184Q 


1.03 


G186S 


1.34 


G186E 


1.26 


G186V 


1.19 


G186I 


1.11 


G186A 


1.05 


G186L 


1.02 


S187E 


1.02 


S187T 


1.02 


S188A 


1.19 


S188M 


1.07 


S188G 


1.02 
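
Because the same variant codes recur in Tables 30-4, 30-5 and 30-6, it can be useful to ask which substitutions improve performance in every assay. The sketch below is illustrative only: the function name `improved_in_all` is hypothetical, each table is assumed to be held as a mapping from variant code to performance index, and only a handful of rows copied from the three tables are included.

```python
def improved_in_all(*assays, threshold=1.0):
    """Return sorted variant codes whose index exceeds threshold in every assay.

    Each assay is a dict mapping a variant code to its performance index.
    A variant missing from any assay is not reported.
    """
    shared = set(assays[0]).intersection(*assays[1:])
    return sorted(
        code for code in shared
        if all(assay[code] > threshold for assay in assays)
    )


# Rows copied from Table 30-4 (BMI-LVJ 1), Table 30-5 (BMI-low pH)
# and Table 30-6 (scrambled egg, ADW).
lvj1 = {"T036I": 9.16, "N073T": 1.67, "G077T": 1.48, "F001T": 1.06}
low_ph = {"T036I": 9.02, "N073T": 1.77, "G077T": 1.86, "F001T": 1.06}
adw = {"T036I": 6.72, "N073T": 1.89, "G077T": 1.87, "F001T": 1.00}

print(improved_in_all(lvj1, low_ph, adw))
```

With these rows, F001T is excluded because its ADW index of 1.00 is not greater than 1, while T036I, N073T and G077T exceed 1 in all three assays.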



LAS Stability

The following table (Table 30-7) shows all variants that have improved stability
compared to the WT ASP. All variants were tested, and the calculations performed,
according to the protocol shown above (See, "LAS Stability Assay"). The table provides the
residual activity after incubation for the variants. Under these conditions, the average
WT value was found to be 10.59% residual activity. All variants with a higher residual
activity are improved with respect to the WT molecule.



Table 30-7. LAS Stability Assay Results 



Variant 
code 


LAS 
stability 
[residual 
Activity 

(%)] 


F001P 


21.73 


F001N 


16.59 


F001R 


11.13 


D002P 


22.43 


D002I 


20.86 


D002V 


20.15 


D002T 


19.97 


D002M 


15.20 


D002N 


13.27 


D002F 


12.71 


D002A 


12.13 


D002C 


11.50 


A008G 


33.00 


A008T 


20.39 


A008R 


18.33 


A008P 


14.19 


T010L 


24.24 



T010C 


24.00 


T010Y 


20.40 


T010Q 


19.48 


T010D 


18.06 


T010E 


17.48 


T010F 


17.10 


T010M 


14.94 


T010W 


12.63 


I011W


50.85


I011E


26.05


I011T


23.20


I011Q


22.59 


G012D 


41.99 


G012Q 


28.25 


G012N 


27.52 


G012V 


27.44 


G012S 


24.06 


G012I 


23.30 


G012H 


19.43 


G012Y 


16.33 


G012P 


15.10 



G012R 


13.43 


G012A 


12.15 


G012L 


11.15 


G012W 


10.66 


G013E 


18.82 


G013D 


16.72 


G013K 


10.79 


G013K 


10.79 


R014E 


71.80 


R014D 


64.85 


R014T 


45.51 


R014G 


31.47 


R014S 


30.62 


R014I 


26.03 


R014A 


25.60 


R014Q 


25.38 


R014C 


23.91 


R014N 


23.61 


R014M 


18.47 


R014H 


15.72 


R014L 


15.35 
R014P


12.43 


S015R 


57.77 


S015H 


53.39 


S015C 


50.38 


S015E 


25.99 


S015Y 


23.97 


S015M 


19.73 


S015F 


17.11 


S015N 


16.21 


S015G 


14.44 


S015L 


12.00 


S015A 


11.84 


S015T 


11.83 


S015I 


10.89 


R016E 


34.61 


R016T 


27.36 


R016C 


25.97 


R016V 


25.79 


R016D 


22.22 


R016Q 


19.87 


R016I 


19.83 


R016S 


10.71 


A022C 


27.48 


A022S 


25.99 


N024E 


23.54 


N024T 


18.16 


N024G 


15.54 


N024S 


14.04 


N024F 


13.05 


N024V 


11.86


I028V


14.49


R035E 


88.92 


R035D 


76.48 


R035Q 


49.08


R035V 


49.02 


R035S 


47.13 


R035T 


44.84 


R035N 


42.49 


R035A 


42.38 


R035C 


41.31 


R035P 


32.50 


R035H 


27.88 


R035M 


25.29 


R035K 


15.26 


T036C 


25.91 


T036V 


20.77 


A038D 


47.40 


A038C 


34.28 


A038T 


12.27 


A041D 


24.80 
A041C


23.37 


A041T 


18.58 


A041S 


15.58 


N042D 


15.04 


N042C 


13.16 


T044E 


33.74


T044C 


17.24 


T046V 


40.22 


T046F 


34.46 


T046E 


34.01 


T046Y 


27.10 


T046C 


23.20 


F047R 


46.98 


F047V 


20.38 


F047I 


12.72 


A048E 


29.23 


G049C 


64.06


G049Q 


49.53 


G049E 


48.76 


G049H 


47.79 


G049A 


43.93 


G049V 


43.28 


G049N 


29.58 


G049L 


24.93 


G049S 


19.86 


G049F 


16.65 


G049K 


15.46 


G049T 


11.73 


S051L 


19.79 


S051A 


15.12 


S051C 


14.59 


S051G 


14.33 


P053C 


11.51 


P053N 


10.68 


G054C 


26.41 


G054E 


19.88 


G054Q 


12.71 


G054K 


11.71 


N055G 


33.29 


N055A 


15.31 


D056L 


42.96 


D056F 


17.11 


Y057G 


27.33 


F059W 


31.25 


R061E 


30.95 


R061V 


26.22 


R061M 


26.01 


R061T 


23.33 


R061K 


20.21 


R061Q 


18.05 



G063D 


13.79 


A064C 


15.65 


G065D 


14.73 


V066N 


16.37 


A070M 


21.09 


A070G 


15.83 


A070P 


14.86 


Q071L 


11.17 


Y075W 


10.97 


G078H 


12.06 


R079T 


16.18 


R079V 


15.24 


R079L 


12.03 


V080E 


10.65 


Q081P 


18.28 


Q081G 


15.49 


Q081A 


14.60 


Q081E 


14.36 


Q081H 


14.02 


Q081S 


13.51 


Q081D 


13.17 


Q081Y 


13.15 


Q081F 


12.61 


Q081I 


11.93 


Q081W 


11.89 


Q081C 


11.40 


A083H 


17.04 


A083D 


15.14 


A083E 


14.66 


A083Y 


12.54 


A083V 


11.93 


A083N 


11.52 


A083M 


11.35 


A083F 


11.21 


A083I 


10.80 


H085P 


10.62 


T086E 


16.60 


T086I 


13.95 


T086C 


13.70 


T086W 


13.45 


T086V 


12.92 


T086Y 


10.97 


T086F 


10.78 


T086D 


10.70 


A087E 


20.99 


A087C 


17.19 


A087P 


11.78 


A088F 


18.06 


A088E


14.11 


A088V 


13.47 
A088H


10.95 


P089D 


10.88 


V090C 


12.71 


G091Q 


23.98 


S092T 


17.35 


S092I 


11.15 


S092C 


10.93 


S092L 


10.60 


A093H 


14.05 


S099A 


28.58 


S099G 


22.20 


S099K 


17.98 


S099Q 


17.50 


S099H 


15.09 


T100A 


27.16 


T100R 


22.31 


T100K 


22.07 


T100Q 


15.53 


T100C 


11.47 


W103L 


20.25 


H104M 


10.65 


T107R 


26.61 


T107H 


12.35 


T109E 


24.23 


T109K 


17.25 


N112P 


25.16 


N112E 


17.68 


N112D 


15.90 


S113C 


35.77 


S113A 


16.28 


S113D 


14.68 


S113H 


13.27 


S114C 


22.24 


S114E 


16.60 


S114D 


11.86


T116C 


16.41 


T116N 


14.90 


T116G 


14.42 


T116A 


11.29 


P118R 


28.25 


P118K 


23.28 


P118C 


16.70 


P118A 


15.98 


P118W 


15.50 


P118G 


14.55 


P118H 


13.73 


P118F 


12.80 


P118Y 


11.29 


E119G 


32.98 


E119Y 


29.43 
E119R


26.97 


E119T 


26.28 


E119V 


24.47 


E119N 


20.71 


E119A 


19.95 


E119L 


15.83 


E119S 


15.80 


E119Q 


14.68 


T121E 


36.49 


T121L 


34.33 


T121F 


23.82 


T121A 


17.78 


T121D 


16.73 


T121V 


14.25 


T121Q 


12.39 


T121G 


12.17 


T121S 


11.93 


T121N 


11.51 


R123D 


48.24 


R123Y 


47.97 


R123C 


46.46 


R123E 


44.33 


R123N 


40.60 


R123H 


39.41 


R123T 


34.97 


R123W 


33.83 


R123F 


30.58 


R123S 


30.56 


R123Q 


25.60 


R123V 


24.71 


R123M 


18.54 


R123A 


17.24 


R123K 


16.38 


R123G 


16.12 


R123I 


16.04 


G124D 


25.10 


G124N 


12.84 


L125Q 


25.77 


L125M 


14.90 


R127E 


36.18 


R127S 


31.24 


R127D 


29.46 


R127Q 


27.92 


R127K 


25.25 


R127A 


21.74 


R127C 


16.40 


R127T 


14.31 


R127Y 


13.61 


R127H 


12.89 


R127F 


10.69 



T128A 


21.49 


T128V 


12.94 


V130C 


12.97 


A132S 


19.09 


A132P 


11.71 


P134R 


22.20 


S140P 


21.06 


L141M 


18.59 


L141C 


12.46 


A143H 


10.95 


G144E 


12.63 


N145E 


12.29 


Q146D 


12.05 


T151L 


46.42 


T151C 


26.57 


T151V 


17.57 


S155C 


38.40 


S155W 


30.61 


S155Y 


23.95 


S155I 


22.60 


S155V 


21.53


S155E 


19.78 


S155T 


17.58 


S155F 


17.11 


S155Q 


12.59 


N157D 


18.83 


R159T 


28.61 


R159E 


27.00 


R159Q 


25.25 


R159D 


23.12 


R159V 


22.92 


R159S 


22.29 


R159K 


20.78 


R159N 


19.95 


R159C 


19.24 


R159A 


19.09 


R159M 


15.74 


R159L 


14.00 


R159H 


12.56 


R159Y 


11.23 


T160D 


15.18 


T160E 


11.72 


T163D 


23.84 


T163C 


19.09 


T163Q 


14.20 


T163R 


11.15 


F165W 


28.00 


F165E 


23.57 


F165H 


21.46 


F165S 


14.33 



Q167F


[illegible]


[illegible]


12.59


V169A


12.75


N170D


29.08


N170C


23.07


N170I


14.63


N170[illegible]


[illegible]


N170A


1[illegible].77


N170P


1[illegible].72


[illegible]


[illegible]


[illegible]


16.62


[illegible]


14.76


Q174T 


14.54 


Q174V 


13.40 


Q174H 


11.18 


A175T 


16.19 


G177D 


24.74 


G177E 


21.37 


G177C 


14.01 


G177N 


11.53 


R179E 


25.06 



R179D 


24.16 


R179C 


20.71 


R179V 


20.09 


R179I 


19.51 


R179T 


19.20 


R179Y 


17.89 


R179M 


16.74 


R179S 


16.12 


R179N 


16.11 


R179F 


15.67 


R179W 


15.56 


R179L 


15.12 


R179A 


14.35 


R179K 


12.30 


M180L 


25.64 


M180I 


12.31 


I181C 


11.51 


T182L 


12.63 


T183D 


13.51 


T183E 


13.32 


S185D 


14.31 



S185C 


13.10 


S185Y 


10.74 


S185N 


10.73 


G186E 


14.36 


G186P 


13.48 


G186C 


11.96 


S187E 


15.92 


S187F 


13.28 


S187L 


12.26 


S187C 


11.34 


S187W 


11.21 


S187G 


10.83 


S187A 


10.72 


S187V 


10.71 


S187H 


10.66 


S188E 


15.00 


S188C 


12.56 


S188T 


11.89 


S188G 


11.15 


S188V 


10.68 



EXAMPLE 31 
Determination of ASP Cleaning Activity 

In this Example, experiments conducted to determine the cleaning activity of ASP 
under various conditions, as well as the properties of the various wash conditions are 
described. 

There is a wide variety of wash conditions including varying detergent formulations, 
wash water volume, wash water temperature, and length of wash time. Thus, detergent 
components such as proteases must be able to tolerate and function under adverse 
environmental conditions. For example, detergent formulations used in different areas have 
different concentrations of their relevant components present in the wash water. For 
example, a European detergent typically has about 3000-8000 ppm of detergent 
components in the wash water, while a Japanese detergent typically has less than 800 (e.g., 
667 ppm) of detergent components in the wash water. In North America, particularly the 
United States, detergents typically have about 800 to 2000 (e.g., 975 ppm) of detergent 
components present in the wash water. 

Latin American detergents are generally high suds phosphate builder detergents and 
the range of detergents used in Latin America can fall in both the medium and high 
detergent concentrations, as they range from 1500 ppm to 6000 ppm of detergent 
components in the wash water. Brazilian detergents typically have approximately 1500 ppm 
of detergent components present in the wash water. However, other high suds phosphate 
builder detergent geographies, not limited to other Latin American countries, may have high 
detergent concentration systems up to about 6000 ppm of detergent components present in 
the wash water. 

In light of the foregoing, it is evident that concentrations of detergent compositions in 
typical wash solutions throughout the world vary from less than about 800 ppm of 
detergent composition ("low detergent concentration geographies"), for example about 667 
ppm in Japan, to between about 800 ppm to about 2000 ppm ("medium detergent 
concentration geographies"), for example about 975 ppm in U.S. and about 1500 ppm in 
Brazil, to greater than about 2000 ppm ("high detergent concentration geographies"), for 
example about 3000 ppm to about 8000 ppm in Europe and about 6000 ppm in high suds 
phosphate builder geographies. 
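
The three concentration bands described above can be expressed as a small lookup. The following is a minimal Python sketch for illustration only; the function name `classify_detergent_concentration` is hypothetical, and the handling of the exact boundary values (800 ppm and 2000 ppm are placed in the medium band) is an assumption, since the text gives the bands only approximately.

```python
def classify_detergent_concentration(ppm: float) -> str:
    """Return the concentration band for a wash solution.

    Bands follow the text: below about 800 ppm is "low", about 800 to
    about 2000 ppm is "medium", above about 2000 ppm is "high".
    Boundary handling is an assumption made for this sketch.
    """
    if ppm < 800:
        return "low"
    if ppm <= 2000:
        return "medium"
    return "high"


# Example values quoted in the text.
examples = {"Japan": 667, "U.S.": 975, "Brazil": 1500, "high suds phosphate": 6000}
bands = {name: classify_detergent_concentration(ppm) for name, ppm in examples.items()}
```

With the example values quoted in the text, Japan falls in the low band, the U.S. and Brazil in the medium band, and the high suds phosphate builder case in the high band.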

The concentrations of the typical wash solutions are determined empirically. For 
example, in the U.S., a typical washing machine holds a volume of about 64.4 L of wash 
solution. Accordingly, in order to obtain a concentration of about 975 ppm of detergent 
within the wash solution, about 62.79 g of detergent composition must be added to the 64.4 
L of wash solution. This amount is the typical amount measured into the wash water by the 
consumer using the measuring cup provided with the detergent. 
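
The arithmetic behind the 62.79 g figure can be checked directly. The sketch below assumes that 1 ppm corresponds to 1 mg of detergent per liter of wash solution (a common approximation for dilute aqueous solutions, not stated explicitly in the text); the function name `detergent_dose_grams` is hypothetical.

```python
def detergent_dose_grams(target_ppm: float, wash_volume_liters: float) -> float:
    """Mass of detergent (g) needed to reach target_ppm in the given volume.

    Assumes 1 ppm == 1 mg/L, so mass in mg is ppm * liters.
    """
    milligrams = target_ppm * wash_volume_liters
    return milligrams / 1000.0


# U.S. example from the text: about 975 ppm in about 64.4 L of wash solution.
us_dose = detergent_dose_grams(975, 64.4)
```

Under that assumption, 975 ppm in 64.4 L works out to 62.79 g, which agrees with the amount given in the text.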

As a further example, different geographies use different wash temperatures. The 
temperature of the wash water in Japan is typically less than that used in Europe. For 
example, the temperature of the wash water in North America and Japan can be between 
10 and 30°C (e.g., about 20°C), whereas the temperature of wash water in Europe is 
typically between 30 and 50°C (e.g., about 40°C). 

As a further example, different geographies may have different water hardness. 
Water hardness is typically described as grains per gallon mixed Ca²⁺/Mg²⁺. Hardness is a 
measure of the amount of calcium (Ca²⁺) and magnesium (Mg²⁺) in the water. Most water in 
the United States is hard, but the degree of hardness varies from area to area. Moderately 
hard (60-120 ppm) to hard (121-181 ppm) water has 60 to 181 parts per million of hardness 
minerals (i.e., parts per million divided by 17.1 equals grains per U.S. gallon). 
Table 31-1 provides ranges of water hardness. 



Table 31-1. Water Hardness Ranges 

Water              Grains per Gallon     Parts per Million 
Soft               less than 1.0         less than 17 
Slightly hard      1.0 to 3.5            17 to 60 
Moderately hard    3.5 to 7.0            60 to 120 
Hard               7.0 to 10.5           120 to 180 
Very hard          greater than 10.5     greater than 180 
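
The conversion and the ranges in Table 31-1 can be combined into a short helper. This is a sketch under stated assumptions: the factor 17.1 ppm per grain per gallon is taken from the text, the function names are hypothetical, and values that fall exactly on a shared boundary (for example 3.5 grains per gallon) are assigned to the lower category, a choice the table itself does not specify.

```python
PPM_PER_GRAIN_PER_GALLON = 17.1  # conversion factor given in the text


def ppm_to_grains_per_gallon(ppm: float) -> float:
    """Convert hardness in parts per million to grains per U.S. gallon."""
    return ppm / PPM_PER_GRAIN_PER_GALLON


def classify_hardness(grains_per_gallon: float) -> str:
    """Map a hardness value (grains per gallon) to a Table 31-1 category.

    Shared boundary values go to the lower category (an assumption).
    """
    if grains_per_gallon < 1.0:
        return "soft"
    if grains_per_gallon <= 3.5:
        return "slightly hard"
    if grains_per_gallon <= 7.0:
        return "moderately hard"
    if grains_per_gallon <= 10.5:
        return "hard"
    return "very hard"


# Hardness values used for the wash conditions in this Example.
japanese = classify_hardness(3)
north_american = classify_hardness(6)
european = classify_hardness(15)
```

With the values used in this Example, 3 grains per gallon is slightly hard, 6 grains per gallon is moderately hard, and 15 grains per gallon is very hard.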



European water hardness is typically greater than 10.5 (e.g., 10.5-20.0) grains per 
gallon mixed Ca²⁺/Mg²⁺ (e.g., about 15 grains per gallon mixed Ca²⁺/Mg²⁺). North American 
water hardness is typically greater than Japanese water hardness, but less than European 
water hardness. For example, North American water hardness can be between 3 to 10 
grains, 3-8 grains or about 6 grains. Japanese water hardness is typically lower than North 
American water hardness, typically less than 4, for example 3 grains per gallon mixed 
Ca²⁺/Mg²⁺. 

The present invention provides protease variants that provide improved wash 
performance in at least one set of wash conditions and typically in multiple wash conditions. 

As described herein, the protease variants are tested for performance in different 
types of detergent and wash conditions using a microswatch assay (See above, and U.S. 
Pat. Appln. Ser. No. 09/554,992; and WO 99/34011, both of which are incorporated by 
reference herein). Protease variants are tested for other soil substrates also in a similar 
fashion. 

In the experiments conducted to determine cleaning activity of ASP, the following 
methods were used. An incubator (Innova 4330 Model Incubator, New Brunswick) was 
pre-warmed for 60 minutes to 40°C for "European" conditions and to 20°C for "Japanese" 
conditions. Blood-Milk-Ink swatches (EMPA 116) were obtained from the Swiss Federal 
Laboratories for Material Testing and from CFT Research, and were modified by exposure 
to 0.03% hydrogen peroxide for 30 minutes at 60°C, then dried. Circles of 1/4" diameter 
were cut from the dried swatches and placed vertically, one per well, in a 96 well microplate. 

Protease samples of ASP were diluted in 10 mM NaCl, 0.005% TWEEN®-80 to 
provide the desired concentration of 10 ppm (protein). To provide "North American wash 
conditions," 1 gram per liter TIDE® laundry detergent (Procter & Gamble) without bleach 
was prepared in deionized water, and a concentrated stock of calcium and magnesium was 
added to result in a final water hardness value of 6 grains per gallon. To provide "European 
wash conditions," 7.6 gram per liter ARIEL® REGULAR laundry detergent (Procter & 
Gamble) without bleach was prepared in deionized water, and a concentrated stock of 
calcium and magnesium was added to result in a final water hardness value of 15 grains per 




gallon. To provide "Japanese wash conditions," 0.67 gram per liter PURE CLEAN laundry 
detergent (Procter & Gamble) without bleach was prepared in deionized water, and a 
concentrated stock of calcium and magnesium was added to result in a final water hardness 
value of 3 grains per gallon. 

In yet another detergent composition to provide "Japanese wash conditions with 
North American detergent formulation," 0.66 gram per liter Detergent Composition III without 
bleach was prepared in deionized water, and a concentrated stock of calcium and 
magnesium was added to result in a final water hardness value of 3 grains per gallon. 

The detergent solutions were allowed to mix for 15 minutes and were then filtered 
through a 0.2 micron cellulose acetate filter. 190 µl of the respective detergent solution 
was then added to the appropriate wells of a microplate. Then, 10 µl of the enzyme 
preparation were added to the filtered detergent in order to obtain a final concentration of 
0.25-3.0 ppm (micrograms per milliliter) of enzyme, for a total volume of 200 µl. The microplate 
was then sealed to prevent leakage, placed in a holder on an incubator/shaker set to 20°C 
and 350/400 RPM and allowed to shake for one hour. 
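
The well volumes above imply a 20-fold dilution of the enzyme preparation (10 µl into a 200 µl total). The following sketch works through that dilution; the function names are hypothetical and the calculation assumes simple volumetric mixing with no volume change.

```python
ENZYME_VOLUME_UL = 10.0      # enzyme preparation added per well (from the text)
DETERGENT_VOLUME_UL = 190.0  # detergent solution per well (from the text)


def final_concentration_ppm(stock_ppm: float) -> float:
    """Enzyme concentration in the well after adding the stock to the detergent."""
    total_volume = ENZYME_VOLUME_UL + DETERGENT_VOLUME_UL
    return stock_ppm * ENZYME_VOLUME_UL / total_volume


def required_stock_ppm(target_ppm: float) -> float:
    """Stock concentration needed to reach target_ppm in the well."""
    total_volume = ENZYME_VOLUME_UL + DETERGENT_VOLUME_UL
    return target_ppm * total_volume / ENZYME_VOLUME_UL


# A 10 ppm preparation, and the stocks needed for the ends of the 0.25-3.0 ppm range.
from_10_ppm = final_concentration_ppm(10.0)
stock_for_low_end = required_stock_ppm(0.25)
stock_for_high_end = required_stock_ppm(3.0)
```

On these assumptions, a 10 ppm preparation gives 0.5 ppm in the well, and covering the stated 0.25-3.0 ppm range requires preparations of 5 to 60 ppm.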

The plate was then removed from the incubator/shaker and an aliquot of 100 µl of 
solution was removed from each well, and placed on a fresh Costar microtiter plate 
(Corning). The absorbance at 405 nm wavelength was read for each aliquot on a Microtiter 
plate reader (SpectraMax 340, Molecular Devices), and reported. The detergent 
composition and incubation conditions in the microswatch assay are set forth in Table 31-2. 



Table 31-2. Detergent Composition and Incubation Conditions 

Geography            Detergent                        Water Hardness        Enzyme dosage    Temperature    Swatch 
Powder detergent 
European             Ariel Regular, 7.6 g/l           15 gpg, Ca/Mg=4/1     0.25-3.0 ppm     40°C           Superfix 
North American       Detergent Comp. III, 1.0 g/l     6 gpg, Ca/Mg=3/1      0.25-3.0 ppm     20°C           3K 
Japanese             Pure Clean, 0.66 g/l             3 gpg, Ca/Mg=3/1      0.25-3.0 ppm     20°C           3K 
Japanese (pseudo)    Detergent Comp. III, 0.66 g/l    3 gpg, Ca/Mg=3/1      0.25-3.0 ppm     20°C           3K 
Liquid detergent     Liquid-Tide® (1.5 ml/L)          6 gpg                 0.25-3.0 ppm     20°C           3K 

The dose response curves depicting absorbance at 405 nm as a function of 
concentration (ppm in well), for PURAFECT® (Genencor), OPTIMASE® (Genencor), 
RELASE™ (Genencor; GG36-variant described above), and ASP are provided in Figures 
23-27. 

As indicated in Figure 26, under North American conditions, in liquid TIDE® 
detergent, the ASP protease showed enhanced cleaning performance as compared to 
PURAFECT®, RELASE™ and OPTIMASE™ proteases under the same conditions. Under 
Japanese conditions, in Detergent Comp. III powder (0.66 g/l), ASP showed enhanced or 
the same cleaning performance as compared to PURAFECT®, RELASE™ and 
OPTIMASE™ proteases under the same conditions (See, Figure 27). Under European 
conditions, in ARIEL® REGULAR powder detergent, the ASP protease showed enhanced 
cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ 
proteases under the same conditions (See, Figure 28). In both tests, ASP and 
OPTIMASE™ provided results that were 2 to 10 times the absorbance at 405 nm as 
compared to PURAFECT® and RELASE™. Under Japanese conditions, in PURE CLEAN 
powder detergent (See, Figure 29), the ASP protease showed enhanced or comparable 
cleaning performance as compared to PURAFECT®, RELASE™ and OPTIMASE™ 
proteases under the same conditions. Under North American conditions, in Detergent 
Composition III powder detergent (See, Figure 30), the ASP protease showed enhanced or 
comparable cleaning performance as compared to PURAFECT®, RELASE™ and 
OPTIMASE™ proteases under the same conditions. 

EXAMPLE 32 
Liquid Fabric Cleaning Compositions 

This Example provides liquid fabric cleaning compositions that find use in 
conjunction with the present invention. These compositions are contemplated to find 
particular utility under Japanese machine wash conditions, as well as for applications 
involving cleaning of fine and/or delicate fabrics. Table 32-1 provides a suitable 
composition. However, it is not intended that the present invention be limited to this specific 
formulation, as many other formulations find use with the present invention. 

Table 32-1. Liquid Fabric Cleaning Composition 

Component                        Amount (%) 
AE2.5S                           2.16 
AS                               3.30 
N-Cocoyl N-methyl glucamine      1.10 
Nonionic surfactant              10.00 
Citric acid                      0.40 
Fatty acid                       0.70 
Base                             0.85 
Monoethanolamine                 1.01 
1,2-Propanediol                  1.92 
EtOH                             0.24 
HXS                              2.09 
Protease¹                        0.01 
Amylase                          0.06 
Minors/inerts to 100% 
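
The "Minors/inerts to 100%" entry is whatever remains after the listed components are summed. As a worked check of Table 32-1, the sketch below adds the listed percentages and computes that balance; the variable and function names are hypothetical, and the figures are copied from the table.

```python
# Component percentages copied from Table 32-1.
TABLE_32_1 = {
    "AE2.5S": 2.16,
    "AS": 3.30,
    "N-Cocoyl N-methyl glucamine": 1.10,
    "Nonionic surfactant": 10.00,
    "Citric acid": 0.40,
    "Fatty acid": 0.70,
    "Base": 0.85,
    "Monoethanolamine": 1.01,
    "1,2-Propanediol": 1.92,
    "EtOH": 0.24,
    "HXS": 2.09,
    "Protease": 0.01,
    "Amylase": 0.06,
}


def balance_to_100(components: dict) -> float:
    """Percentage left for minors/inerts after the listed components."""
    listed_total = sum(components.values())
    if listed_total > 100.0:
        raise ValueError("listed components exceed 100%")
    return round(100.0 - listed_total, 2)


listed = round(sum(TABLE_32_1.values()), 2)
minors_and_inerts = balance_to_100(TABLE_32_1)
```

For Table 32-1 the listed components total 23.84%, leaving 76.16% for minors and inerts. The same check can be applied to the other formulation tables to catch transcription errors, since any column whose listed components exceed 100% cannot be correct.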





EXAMPLE 33 
Liquid Dishwashing Compositions 

This Example provides liquid dishwashing compositions that find use in conjunction 
with the present invention. These compositions are contemplated to find particular utility 
under Japanese dish washing conditions. Table 33-1 provides suitable compositions. 
However, it is not intended that the present invention be limited to this specific formulation, 
as many other formulations find use with the present invention. 



Table 33-1. Liquid Dishwashing Compositions 

Component                        A          B 
AE1.4S                           24.69      24.69 
N-cocoyl N-methyl glucamine      3.09       3.09 
Amine oxide                      2.06       2.06 
Betaine                          2.06       2.06 
Nonionic surfactant              4.11       4.11 
Hydrotrope                       4.47       4.47 
Magnesium                        0.49       0.49 
Ethanol                          -          7.2 
LemonEase                        0.45       0.45 
Geraniol/BHT                     -          0.60/0.02 
Amylase                          0.03       0.005 
Protease                         0.01       0.43 
Balance to 100% 







EXAMPLE 34 
Liquid Fabric Cleaning Compositions 

The proteases of the present invention find particular use in cleaning compositions. 
For example, it is contemplated that liquid fabric cleaning compositions of particular utility 
under Japanese machine wash conditions be prepared in accordance with the invention. In 
some preferred embodiments, these compositions comprise the following components 
shown in Table 34-1. 



Table 34-1. Liquid Fabric Cleaning Composition 

Component                              Amount (%) 
AE2.5S                                 15.00 
AS                                     5.50 
N-Cocoyl N-methyl glucamine            5.50 
Nonionic surfactant                    4.50 
Citric acid                            3.00 
Fatty acid                             5.00 
Base                                   0.97 
Monoethanolamine                       5.10 
1,2-Propanediol                        7.44 
EtOH                                   5.50 
HXS                                    1.90 
Boric Acid                             3.50 
Ethoxylated tetraethylenepentamine     3.00 
SRP                                    0.30 
Protease                               0.069 
Amylase                                0.06 
Cellulase                              0.08 
Lipase                                 0.18 
Brightener                             0.10 
Minors/inerts to 100% 





EXAMPLE 35 
Granular Fabric Cleaning Compositions 

In this Example, various granular fabric cleaning compositions that find use with 
the present invention are provided. The following Tables provide suitable compositions. 
However, it is not intended that the present invention be limited to these specific 
formulations, as many other formulations find use with the present invention. 



Table 35-1. Granular Fabric Cleaning Compositions 

Component                                              Formulations 
                                                       A         B         C         D 
Protease 1                                             0.10      0.20      0.03      0.05 
Protease 2                                             -         -         0.2       0.15 
C13 linear alkyl benzene sulfonate                     22.00     22.00     22.00     22.00 
Phosphate (as sodium tripolyphosphate)                 23.00     23.00     23.00     23.00 
Sodium carbonate                                       23.00     23.00     23.00     23.00 
Sodium silicate                                        14.00     14.00     14.00     14.00 
Zeolite                                                8.20      8.20      8.20      8.20 
Chelant (diethylenetriaminepentaacetic acid)           0.40      0.40      0.40      0.40 
Sodium sulfate                                         5.50      5.50      5.50      5.50 
Water                                                  Balance to 100% 




Table 35-2. Granular Fabric Cleaning Compositions 

Component                                              Formulations 
                                                       A         B         C         D 
Protease 1                                             0.10      0.20      0.30      0.05 
Protease 2                                             -         -         0.2       0.1 
C12 alkyl benzene sulfonate                            12.00     12.00     12.00     12.00 
Zeolite A (1-10 micrometer)                            26.00     26.00     26.00     26.00 
C12-C14 secondary (2,3) alkyl sulfate, Na salt         5.00      5.00      5.00      5.00 
Sodium citrate                                         5.00      5.00      5.00      5.00 
Optical brightener                                     0.10      0.10      0.10      0.10 
Sodium sulfate                                         17.00     17.00     17.00     17.00 
Fillers, water, minors                                 Balance to 100% 



The following laundry detergent compositions are contemplated to provide particular 
utility under European machine wash conditions. 




Table 35-3. Granular Fabric Cleaning Compositions 

Component                                    Formulations 
                                             A          B          C 
LAS                                          7.0        5.61       4.76 
TAS                                          -          -          1.57 
C45AS                                        6.0        2.24       3.89 
C25E25                                       1.0        0.76       1.18 
C45E7                                        -          -          2.0 
C25E3                                        4.0        5.5        - 
QAS                                          0.8        2.0        2.0 
STPP                                         -          -          - 
Zeolite                                      25.0       19.5       19.5 
Citric acid                                  2.0        2.0        2.0 
NaSKS-6                                      8.0        10.6       10.6 
Carbonate                                    8.0        10.0       8.6 
MA/AA                                        1.0        2.6        1.6 
CMC                                          0.5        0.4        0.4 
PB4                                          -          12.7       - 
Percarbonate                                 -          -          19.7 
TAED                                         -          3.1        5.0 
Citrate                                      7.0        -          - 
DTPMP                                        0.25       0.2        0.3 
HEDP                                         0.3        0.3        0.3 
QEA 1                                        0.9        1.2        1.0 
Protease 1                                   0.02       0.05       0.035 
Lipase                                       0.15       0.25       0.15 
Cellulase                                    0.28       0.28       0.28 
Amylase                                      0.4        0.7        0.3 
PVPI/PVNO                                    0.4        -          0.1 
Photoactivated bleach (ppm)                  15 ppm     27 ppm     27 ppm 
Brightener 1                                 0.08       0.19       0.19 
Brightener 2                                 -          0.04       0.04 
Perfume                                      0.3        0.3        0.3 
Effervescent granules (malic acid 40%, 
  sodium bicarbonate 40%, sodium 
  carbonate 20%)                             15         15         5 
Silicone antifoam                            0.5        2.4        2.4 
Minors/inerts to 100%                        Balance to 100% 



EXAMPLE 36 
Detergent Formulations 

In this Example, various detergent formulations which find use with ASP and/or ASP 
variants are provided. It is understood that the test methods provided in this section must 
be used to determine the respective values of the parameters of the present invention. 

In the exemplified detergent compositions, the enzyme levels are expressed as 
pure enzyme by weight of the total composition and, unless otherwise specified, the 
detergent ingredients are expressed by weight of the total compositions. The abbreviated 
component identifications therein have the following meanings: 



Table 36-1. Definitions Used in this Example 

LAS                  : Sodium linear C11-13 alkyl benzene sulfonate. 
TAS                  : Sodium tallow alkyl sulphate. 
CxyAS                : Sodium C1x-C1y alkyl sulfate. 
CxyEz                : C1x-C1y predominantly linear primary alcohol condensed 
                       with an average of z moles of ethylene oxide. 
CxyAEzS              : C1x-C1y sodium alkyl sulfate condensed with an average of 
                       z moles of ethylene oxide. Added molecule name in the 
                       examples. 
Nonionic             : Mixed ethoxylated/propoxylated fatty alcohol e.g. Plurafac 
                       LF404 being an alcohol with an average degree of 
                       ethoxylation of 3.8 and an average degree of propoxylation of 
                       4.5. 
QAS                  : R2.N+(CH3)2(C2H4OH) with R2 = C12-C14. 
Silicate             : Amorphous Sodium Silicate (SiO2:Na2O ratio = 1.6-3.2:1). 
Metasilicate         : Sodium metasilicate (SiO2:Na2O ratio = 1.0). 
Zeolite A            : Hydrated Aluminosilicate of formula Na12(AlO2SiO2)12·27H2O. 
SKS-6                : Crystalline layered silicate of formula δ-Na2Si2O5. 
Sulfate              : Anhydrous sodium sulphate. 
STPP                 : Sodium Tripolyphosphate. 
MA/AA                : Random copolymer of 4:1 acrylate/maleate, average 
                       molecular weight about 70,000-80,000. 
AA                   : Sodium polyacrylate polymer of average molecular weight 
                       4,500. 
Polycarboxylate      : Copolymer comprising mixture of carboxylated monomers 
                       such as acrylate, maleate and methacrylate with a MW 
                       ranging between 2,000-80,000 such as Sokolan commercially 
                       available from BASF, being a copolymer of acrylic acid, 
                       MW 4,500. 
BB1                  : 3-(3,4-Dihydroisoquinolinium)propane sulfonate. 
BB2                  : 1-(3,4-Dihydroisoquinolinium)-decane-2-sulfate. 
PB1                  : Sodium perborate monohydrate. 
PB4                  : Sodium perborate tetrahydrate of nominal formula 
                       NaBO3.4H2O. 
Percarbonate         : Sodium percarbonate of nominal formula 2Na2CO3.3H2O2. 
TAED                 : Tetraacetyl ethylene diamine. 
NOBS                 : Nonanoyloxybenzene sulfonate in the form of the sodium salt. 
DTPA                 : Diethylene triamine pentaacetic acid. 
HEDP                 : 1,1-hydroxyethane diphosphonic acid. 
DETPMP               : Diethyltriamine penta (methylene) phosphonate, marketed by 
                       Monsanto under the Trade name Dequest 2060. 
EDDS                 : Ethylenediamine-N,N'-disuccinic acid, (S,S) isomer in the form 
                       of its sodium salt. 
Diamine              : Dimethyl aminopropyl amine; 1,6-hexane diamine; 1,3- 
                       propane diamine; 2-methyl-1,5-pentane diamine; 1,3- 
                       pentanediamine; 1-methyl-diaminopropane. 
DETBCHD              : 5,12-diethyl-1,5,8,12-tetraazabicyclo[6,6,2]hexadecane, 
                       dichloride, Mn(II) salt. 
PAAC                 : Pentaamine acetate cobalt(III) salt. 
Paraffin             : Paraffin oil sold under the tradename Winog 70 by 
                       Wintershall. 
Paraffin Sulfonate   : A Paraffin oil or wax in which some of the hydrogen atoms 
                       have been replaced by sulfonate groups. 
Aldose oxidase       : Oxidase enzyme sold under the tradename Aldose Oxidase 
                       by Novozymes A/S. 
Galactose oxidase    : Galactose oxidase from Sigma. 
Protease             : Proteolytic enzyme sold under the tradename Savinase, 
                       Alcalase, Everlase by Novo Nordisk A/S, and the following 
                       from Genencor International, Inc: "Protease A" described in 
                       US RE 34,606 in Figures 1A, 1B, and 7, and at column 11, 
                       lines 11-37; "Protease B" described in US 5,955,340 and 
                       US 5,700,676 in Figures 1A, 1B and 5, as well as Table 1; and 
                       "Protease C" described in US 6,312,936 and US 6,482,628 in 
                       Figures 1-3 [SEQ ID 3], and at column 25, line 12; "Protease 
                       D" being the variant 
                       101G/103A/104I/159D/232V/236H/245R/248D/252K (BPN' 
                       numbering) described in WO 99/20723. 
Amylase              : Amylolytic enzyme sold under the tradename Purafect® Ox 
                       Am described in WO 94/18314, WO 96/05295 sold by 
                       Genencor; Natalase®, Termamyl®, Fungamyl® and 
                       Duramyl®, all available from Novozymes A/S. 
Lipase               : Lipolytic enzyme sold under the tradename Lipolase, Lipolase 
                       Ultra by Novozymes A/S and Lipomax by Gist-Brocades. 
Cellulase            : Cellulytic enzyme sold under the tradename Carezyme, 
                       Celluzyme and/or Endolase by Novozymes A/S. 
Pectin Lyase         : Pectaway® and Pectawash® available from Novozymes A/S. 
PVP                  : Polyvinylpyrrolidone with an average molecular weight of 
                       60,000. 
PVNO                 : Polyvinylpyridine-N-Oxide, with an average molecular weight 
                       of 50,000. 
PVPVI                : Copolymer of vinylimidazole and vinylpyrrolidone, with an 
                       average molecular weight of 20,000. 
Brightener 1         : Disodium 4,4'-bis(2-sulphostyryl)biphenyl. 
Silicone antifoam    : Polydimethylsiloxane foam controller with siloxane- 
                       oxyalkylene copolymer as dispersing agent with a ratio of said 
                       foam controller to said dispersing agent of 10:1 to 100:1. 
Suds Suppressor      : 12% Silicone/silica, 18% stearyl alcohol, 70% starch in 
                       granular form. 
SRP 1                : Anionically end capped poly esters. 
PEG X                : Polyethylene glycol, of a molecular weight of x. 
PVP K60®             : Vinylpyrrolidone homopolymer (average MW 160,000). 
Jeffamine® ED-2001   : Capped polyethylene glycol from Huntsman. 
Isachem® AS          : A branched alcohol alkyl sulphate from Enichem. 
MME PEG (2000)       : Monomethyl ether polyethylene glycol (MW 2000) from Fluka 
                       Chemie AG. 
DC3225C              : Silicone suds suppresser, mixture of Silicone oil and Silica 
                       from Dow Corning. 
TEPAE                : Tetraethylenepentaamine ethoxylate. 
BTA                  : Benzotriazole. 
Betaine              : (CH3)3N+CH2COO- 
Sugar                : Industry grade D-glucose or food grade sugar. 
CFAA                 : C12-C14 alkyl N-methyl glucamide. 
TPKFA                : C12-C14 topped whole cut fatty acids. 
Clay                 : A hydrated aluminium silicate of general formula 
                       Al2O3SiO2·xH2O. Types: Kaolinite, montmorillonite, atapulgite, 
                       illite, bentonite, halloysite. 
pH                   : Measured as a 1% solution in distilled water at 20°C. 



The following Table (Table 36-2) provides liquid laundry detergent compositions that are 
prepared. 

Table 36-2. Liquid Laundry Detergent Compositions 

Component                                I         II        III       IV        V 
LAS                                      24.0      32.0      6.0       8.0       6.0 
C12-C15 AE1.8S                           -         -         8.0       11.0      5.0 
C8-C10 amido propyl dimethyl amine       2.0       2.0       2.0       2.0       1.0 
C12-C14 alkyl dimethyl amine oxide       -         -         -         -         2.0 
C12-C15 AS                               -         -         17.0      7.0       8.0 
CFAA                                     -         5.0       4.0       4.0       3.0 
C12-C14 Fatty alcohol ethoxylate         12.0      6.0       1.0       1.0       1.0 
C12-C18 Fatty acid                       3.0       -         4.0       4.0       3.0 
Citric acid (anhydrous)                  6.0       5.0       3.0       3.0       2.0 
DETPMP                                   -         -         1.0       1.0       0.5 
Monoethanolamine                         #         #         5.0       5.0       2.0 
Sodium hydroxide                         -         -         2.5       1.0       1.5 
Propanediol                              12.7      14.5      13.1      10.0      8.0 
Ethanol                                  1.8       2.4       4.7       5.4       1.0 
DTPA                                     0.5       0.4       0.3       0.4       0.5 
Pectin Lyase                             -         -         -         0.005     - 
Amylase                                  0.001     0.002     -         -         - 
Cellulase                                -         -         0.0002    -         0.0001 
Lipase                                   0.1       -         0.1       -         0.1 
ASP                                      0.05      0.3       0.08      0.5       0.2 
Protease A                               -         -         -         -         0.1 
Aldose Oxidase                           -         -         0.3       -         0.003 
DETBCHD                                  -         -         0.02      0.01      - 
SRP1                                     0.5       0.5       -         0.3       0.3 
Boric acid                               2.4       2.4       2.8       2.8       2.4 
Sodium xylene sulfonate                  -         -         3.0       -         - 
DC 3225C                                 1.0       1.0       1.0       1.0       1.0 
2-butyl-octanol                          0.03      0.04      0.04      0.03      0.03 
Brightener 1                             0.12      0.10      0.18      0.08      0.10 
Balance to 100% perfume / dye and/or water 



# added to product to adjust the neat pH of the product to about 4.2 for (I) and about 3.8 
for (II). 

The following Table (36-3) provides hand dish liquid detergent compositions that are 
prepared. 



Table 36-3. Hand Dish Liquid Detergent Compositions 

Component                              I         II        III       IV        V         VI 
C12-C15 AE1.8S                         30.0      28.0      25.0      -         15.0      10.0 
LAS                                    -         -         -         5.0       15.0      12.0 
Paraffin Sulfonate                     -         -         -         20.0      -         - 
C10-C18 Alkyl Dimethyl Amine Oxide     5.0       3.0       7.0       -         -         - 
Betaine                                3.0       -         1.0       3.0       1.0       - 
C12 poly-OH fatty acid amide           -         -         -         3.0       -         1.0 
C14 poly-OH fatty acid amide           -         1.5       -         -         -         - 
C11E9                                  2.0       -         4.0       -         -         20.0 
DTPA                                   -         -         -         -         0.2       - 
Tri-sodium Citrate dihydrate           0.25      -         -         0.7       -         - 
Diamine                                1.0       5.0       7.0       1.0       5.0       7.0 
MgCl2                                  0.25      -         -         1.0       -         - 
ASP                                    0.02      0.01      0.03      0.01      0.02      0.05 
Protease A                             -         0.01      -         -         -         - 
Amylase                                0.001     -         -         0.002     -         0.001 
Aldose Oxidase                         0.03      -         0.02      -         0.05      - 
Sodium Cumene Sulphonate               -         -         -         2.0       1.5       3.0 
PAAC                                   0.01      0.01      0.02      -         -         - 
DETBCHD                                -         -         -         0.01      0.02      0.01 
Balance to 100% perfume / dye and/or water 



The pH of these compositions is about 8 to about 11. 

Table 36-4 provides liquid automatic dishwashing detergent compositions that are 
prepared. 



Table 36-4. Liquid Automatic Dishwashing Detergent Compositions 

Component             I         II        III       IV        V 
STPP                  16        16        18        16        16 
Potassium Sulfate     -         10        8         -         10 
1,2 propanediol       6.0       0.5       2.0       6.0       0.5 
Boric Acid            4.0       3.0       3.0       4.0       3.0 
CaCl2 dihydrate       0.04      0.04      0.04      0.04      0.04 
Nonionic              0.5       0.5       0.5       0.5       0.5 
ASP                   0.1       0.03      0.05      0.03      0.06 
Protease B            -         -         -         0.01      - 
Amylase               0.02      -         0.02      0.02      - 
Aldose Oxidase        -         0.15      0.02      -         0.01 
Galactose Oxidase     -         -         0.01      -         0.01 
PAAC                  0.01      -         -         0.01      - 
DETBCHD               -         0.01      -         -         0.01 
Balance to 100% perfume / dye and/or water 



Table 36-5 provides laundry compositions that may be prepared in the form of 
granules or tablets. 



Table 36-5. Laundry Compositions 

Base Product                  I         II        III       IV        V 
C14-C15 AS or TAS             8.0       5.0       3.0       3.0       3.0 
LAS                           8.0       -         8.0       -         7.0 
C12-C15 AE3S                  0.5       2.0       1.0       -         - 
C12-C15 E5 or E3              2.0       -         5.0       2.0       2.0 
QAS                           -         -         -         1.0       1.0 
Zeolite A                     20.0      18.0      11.0      -         10.0 
SKS-6 (dry add)               -         -         9.0       -         - 
MA/AA                         2.0       2.0       2.0       -         - 
AA                            -         -         -         -         4.0 
3Na Citrate 2H2O              -         2.0       -         -         - 
Citric Acid (Anhydrous)       2.0       -         1.5       2.0       - 
DTPA                          0.2       0.2       -         -         - 
EDDS                          -         -         0.5       0.1       - 
HEDP                          -         -         0.2       0.1       - 
PB1                           3.0       4.8       -         -         4.0 
Percarbonate                  -         -         3.8       5.2       - 
NOBS                          1.9       -         -         -         - 
NACA OBS                      -         -         2.0       -         - 
TAED                          0.5       2.0       2.0       5.0       1.00 
BB1                           0.06      -         0.34      -         0.14 
BB2                           -         0.14      -         0.20      - 
Anhydrous Na Carbonate        15.0      18.0      8.0       15.0      15.0 
Sulfate                       5.0       12.0      2.0       17.0      3.0 
Silicate                      -         1.0       -         -         8.0 
ASP                           0.03      0.05      1.0       0.06      0.1 
Protease B                    -         0.01      -         -         - 
Protease C                    -         -         -         0.01      - 
Lipase                        -         0.008     -         -         - 
Amylase                       0.001     -         -         -         0.001 
Cellulase                     -         0.0014    -         -         - 
Pectin Lyase                  0.001     0.001     0.001     0.001     0.001 
Aldose Oxidase                0.03      -         0.05      -         - 
PAAC                          -         0.01      -         -         0.05 
Balance to 100% Moisture and/or Minors* 

* Perfume, Dye, Brightener / SRP1 / Na Carboxymethylcellulose / Photobleach / MgSO4 / 
PVPVI / Suds suppressor / High Molecular PEG / Clay. 

Table 36-6 provides liquid laundry detergent formulations which are prepared. 



Table 36-6. Liquid Laundry Detergent Formulations 

Component                   I         I         II        III       IV        V 
LAS                         11.5      11.5      9.0       -         4.0       - 
C12-C15 AE2.85S             -         -         3.0       18.0      -         16.0 
C14-C15 E2.5S               11.5      11.5      3.0       -         16.0      - 
C12-C13 E9                  -         -         3.0       2.0       2.0       1.0 
C12-C13 E7                  3.2       3.2       -         -         -         - 
CFAA                        -         -         -         5.0       -         3.0 
TPKFA                       2.0       2.0       -         2.0       0.5       2.0 
Citric Acid (Anhydrous)     3.2       3.2       0.5       1.2       2.0       1.2 
Ca formate                  0.1       0.1       0.06      0.1       -         - 
Na formate                  0.5       0.5       0.06      0.1       0.05      0.05 
Na Cumene Sulfonate         4.0       4.0       1.0       3.0       1.2       - 
Borate                      0.6       0.6       -         3.0       2.0       3.0 
Na Hydroxide                6.0       6.0       2.0       3.5       4.0       3.0 
Ethanol                     2.0       2.0       1.0       4.0       4.0       3.0 
1,2 Propanediol             3.0       3.0       2.0       8.0       8.0       5.0 
Monoethanolamine            3.0       3.0       1.5       1.0       2.5       1.0 
TEPAE                       2.0       2.0       -         1.0       1.0       1.0 
ASP                         0.03      0.05      0.01      0.03      0.08      0.02 
Protease A                  -         -         0.01      -         -         - 
Lipase                      -         -         -         0.002     -         - 
Amylase                     -         -         -         -         0.002     - 
Cellulase                   -         -         -         -         -         [illegible] 
Pectin Lyase                0.005     0.005     -         -         -         - 
Aldose Oxidase              0.05      -         -         0.05      -         0.02 
Galactose oxidase           -         0.04      -         -         -         - 
Table 36-6. Liquid Laundry Detergent Formulations (continued)



Component I I II III IV V 

PAAC 0.03 0.03 0.02 

DETBCHD - - - 0.02 0.01 

SRP1 0.2 0.2 - 0.1 

DTPA - 0.3 

PVNO - - - 0.3 - 0.2 

Brightener 1 0.2 0.2 0.07 0.1



Silicone antifoam 0.04 0.04 0.02 0.1 0.1 0.1 

Balance to 100% perfume/dye and/or water 



Table 36-7 provides compact high density dishwashing detergents that are prepared. 



Table 36-7. Compact High Density Dishwashing Detergents

Component           I       II      III     IV      V       VI
STPP                -       45.0    45.0    -       -       40.0
3Na Citrate 2H2O    17.0    -       -       50.0    40.2    -
Na Carbonate        17.5    14.0    20.0    -       8.0     33.6
Bicarbonate         -       -       -       26.0    -       -
Silicate            15.0    15.0    8.0     -       25.0    3.6
Metasilicate        2.5     4.5     4.5     -       -       -
PB1                 -       -       4.5     -       -       -
PB4                 -       -       -       5.0     -       -
Percarbonate        -       -       -       -       -       4.8
BB1                 -       0.1     0.1     -       0.5     -
BB2                 0.2     0.05    -       0.1     -       0.6
Nonionic            2.0     1.5     1.5     3.0     1.9     5.9
HEDP                1.0     -       -       -       -       -
DETPMP              0.6     -       -       -       -       -
PAAC                0.03    0.05    0.02    -       -       -
Paraffin            0.5     0.4     0.4     0.6     -       -
ASP                 0.072   0.053   0.053   0.026   0.059   0.01
Protease B          -       -       -       -       -       0.01
Amylase             0.012   -       0.012   -       0.021   0.006
Lipase              -       0.001   -       0.005   -       -
Pectin Lyase        0.001   0.001   0.001   -       -       -
Aldose Oxidase      0.05    0.05    0.03    0.01    0.02    0.01
BTA                 0.3     0.2     0.2     0.3     0.3     0.3
Polycarboxylate     6.0     -       -       -       4.0     0.9
Perfume             0.2     0.1     0.1     0.2     0.2     0.2

Balance to 100% Moisture and/or Minors*

* Brightener / Dye / SRP1 / Na Carboxymethylcellulose / Photobleach / MgSO4 / PVPVI / Suds
suppressor / High Molecular PEG / Clay.

The pH of the above compositions is from about 9.6 to about 11.3.
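
The "Balance to 100%" entry of Table 36-7 can be illustrated with a short calculation. The following is a minimal sketch only, assuming that the table entries are weight percentages and using the values listed for composition I; the function name `balance_to_100` and the dictionary layout are hypothetical and are not part of the disclosure.

```python
# Composition I of Table 36-7, as listed in the table.
# Assumption: all entries are weight percentages of the finished product.
composition_i = {
    "3Na Citrate 2H2O": 17.0,
    "Na Carbonate": 17.5,
    "Silicate": 15.0,
    "Metasilicate": 2.5,
    "BB2": 0.2,
    "Nonionic": 2.0,
    "HEDP": 1.0,
    "DETPMP": 0.6,
    "PAAC": 0.03,
    "Paraffin": 0.5,
    "ASP": 0.072,
    "Amylase": 0.012,
    "Pectin Lyase": 0.001,
    "Aldose Oxidase": 0.05,
    "BTA": 0.3,
    "Polycarboxylate": 6.0,
    "Perfume": 0.2,
}


def balance_to_100(components):
    """Return the share left for moisture and/or minors (hypothetical helper)."""
    total = sum(components.values())
    if total > 100.0:
        raise ValueError(f"listed components exceed 100%: {total:.3f}")
    return 100.0 - total


balance = balance_to_100(composition_i)
print(f"Listed components: {sum(composition_i.values()):.3f} wt %")
print(f"Balance (moisture and/or minors): {balance:.3f} wt %")
```

The same helper can be applied to any other column of the table by substituting that column's values.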
Table 36-8 provides tablet detergent compositions of the present invention that are
prepared by compression of a granular dishwashing detergent composition at a pressure of
13 kN/cm2 using a standard 12-head rotary press:



Table 36-8. Tablet Detergent Compositions

Component           I       II      III     IV      V       VI      VII     VIII
STPP                -       48.8    44.7    38.2    -       42.4    46.1    36.0
3Na Citrate 2H2O    20.0    -       -       -       35.9    -       -       -
Na Carbonate        20.0    5.0     14.0    15.4    8.0     23.0    20.0    28.0
Silicate            15.0    14.8    15.0    12.6    23.4    2.9     4.3     4.2
Lipase              0.001   -       0.01    -       0.02    -       -       -
Protease B          0.01    -       -       -       -       -       -       -
Protease C          -       -       -       -       -       0.01    -       -
ASP                 0.01    0.08    0.05    0.04    0.052   0.023   0.023   0.029
Amylase             0.012   0.012   0.012   -       0.015   -       0.017   0.002
Pectin Lyase        0.005   -       -       0.002   -       -       -       -
Aldose Oxidase      -       0.03    -       0.02    0.02    -       0.03    -
PB1                 -       -       3.8     -       7.8     -       -       8.5
Percarbonate        6.0     -       -       6.0     -       5.0     -       -
BB1                 0.2     -       0.5     -       0.3     0.2     -       -
BB2                 -       0.2     -       0.5     -       -       0.1     0.2
Nonionic            1.5     2.0     2.0     2.2     1.0     4.2     4.0     6.5
PAAC                0.01    0.01    0.02    -       -       -       -       -
DETBCHD             -       -       -       0.02    0.02    -       -       -
TAED                -       -       -       -       -       2.1     -       1.6
HEDP                1.0     -       -       0.9     -       0.4     0.2     -
DETPMP              0.7     -       -       -       -       -       -       -
Paraffin            0.4     0.5     0.5     0.5     -       -       0.5     -
BTA                 0.2     0.3     0.3     0.3     0.3     0.3     0.3     -
Polycarboxylate     4.0     -       -       -       4.9     0.6     0.8     -
PEG 400-30,000      -       -       -       -       -       2.0     -       2.0
Glycerol            -       -       -       -       -       0.4     -       0.5
Perfume             -       -       -       0.05    0.2     0.2     0.2     0.2

Balance to 100% Moisture and/or Minors*

* Brightener / SRP1 / Na Carboxymethylcellulose / Photobleach / MgSO4 / PVPVI / Suds
suppressor / High Molecular PEG / Clay.

The pH of these compositions is from about 10 to about 11.5.

The tablet weight of these compositions is from about 20 grams to about 30 grams. 
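
By way of illustration only, the stated tablet weight range can be combined with the ASP levels of Table 36-8 to estimate the mass of ASP in a single tablet. The sketch below assumes that the table entries are weight percentages; the function name `mass_per_tablet_mg` is hypothetical and the calculation is not part of the disclosed method.

```python
def mass_per_tablet_mg(weight_percent: float, tablet_weight_g: float) -> float:
    """Mass of one component in a single tablet, in milligrams.

    Assumes the table entry is a weight percentage of the tablet.
    """
    return weight_percent / 100.0 * tablet_weight_g * 1000.0


# ASP levels for compositions I to VIII as listed in Table 36-8.
asp_wt_percent = {
    "I": 0.01, "II": 0.08, "III": 0.05, "IV": 0.04,
    "V": 0.052, "VI": 0.023, "VII": 0.023, "VIII": 0.029,
}

# Tablet weights at the two ends of the stated range (about 20 g to about 30 g).
for name, pct in asp_wt_percent.items():
    low = mass_per_tablet_mg(pct, 20.0)
    high = mass_per_tablet_mg(pct, 30.0)
    print(f"Composition {name}: about {low:.1f} to {high:.1f} mg ASP per tablet")
```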

Table 36-9 provides liquid hard surface cleaning detergent compositions of the 
present invention that are prepared. 



Table 36-9. Liquid Hard Surface Cleaning Detergent Compositions

Component                  I       II      III     IV      V       VI      VII
C9-C11E5                   2.4     1.9     2.5     2.5     2.5     2.4     2.5
C12-C14E5                  3.6     2.9     2.5     2.5     2.5     3.6     2.5
C7-C9E5                    -       -       -       -       8.0     -       -
C12-C14E21                 1.0     0.8     4.0     2.0     2.0     1.0     2.0
LAS                        -       -       -       0.8     0.8     -       0.8
Sodium cumene sulfonate    1.5     2.6     -       1.5     1.5     1.5     1.5
Isalchem ® AS              0.6     0.6     -       -       -       0.6     -
Na2CO3                     0.6     0.13    0.6     0.1     0.2     0.6     0.2
3Na Citrate 2H2O           0.5     0.56    0.5     0.6     0.75    0.5     0.75
NaOH                       0.3     0.33    0.3     0.3     0.5     0.3     0.5
Fatty Acid                 0.6     0.13    0.6     0.1     0.4     0.6     0.4
2-butyl octanol            0.3     0.3     -       0.3     0.3     0.3     0.3
PEG DME-2000®              0.4     -       0.3     0.35    0.5     -       -
PVP                        0.3     0.4     0.6     0.3     0.5     -       -
MME PEG (2000) ®           -       -       -       -       -       0.5     0.5
Jeffamine ® ED-2001        -       0.4     -       -       0.5     -       -
PAAC                       -       -       -       0.03    0.03    0.03    -
DETBCHD                    0.03    0.05    0.05    -       -       -       -
ASP [?]                    [values illegible]
Protease B                 -       -       -       -       -       0.01    -
Amylase                    0.12    0.01    0.01    -       0.02    -       0.01
Lipase                     -       0.001   -       0.005   -       0.005   -
Pectin Lyase               0.001   -       0.001   -       -       -       0.002
PB1                        -       4.6     -       3.8     -       -       -
Aldose Oxidase             0.05    -       0.03    -       0.02    0.02    0.05

Balance to 100% perfume / dye and/or water

The pH of these compositions is from about 7.4 to about 9.5. 






EXAMPLE 37 
Animal Feed Comprising ASP 

The present invention also provides animal feed compositions comprising ASP
and/or ASP variants. In this Example, one such feed, suitable for poultry, is provided.
However, it is not intended that the present invention be limited to this specific formulation,
as the proteases of the present invention find use with numerous other feed formulations. It
is further intended that the feeds of the present invention be suitable for administration to
any animal, including but not limited to livestock (e.g., cattle, pigs, sheep, etc.), as well as
companion animals (e.g., dogs, cats, horses, rodents, etc.). The following Table provides a
formulation for a mash, namely a maize-based starter feed suitable for administration to
turkey poults up to 3 weeks of age.



Table 37-1. Animal Feed Composition

Ingredient                  Amount (wt. %)
Maize                       36.65
Soybean meal (45.6% CP)     55.4
Animal-vegetable fat        3.2
Dicalcium phosphate         2.3
Limestone                   1.5
Mineral premix              0.3
Vitamin premix              0.3
Sodium chloride             0.15
DL methionine               0.2



In some embodiments, this feed formulation is supplemented with various 
concentrations of the protease(s) of the present invention (e.g., 2,000 units/kg, 4,000 
units/kg and 6,000 units/kg). 
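
The arithmetic implied by Table 37-1 and the supplementation levels above can be sketched as follows. This is an illustrative calculation only: the function names and the batch size are hypothetical, and the protease activity unit is left as stated in the text ("units"), since its definition is not given in this Example.

```python
# Ingredient levels from Table 37-1, in weight percent.
feed_wt_percent = {
    "Maize": 36.65,
    "Soybean meal (45.6% CP)": 55.4,
    "Animal-vegetable fat": 3.2,
    "Dicalcium phosphate": 2.3,
    "Limestone": 1.5,
    "Mineral premix": 0.3,
    "Vitamin premix": 0.3,
    "Sodium chloride": 0.15,
    "DL methionine": 0.2,
}


def batch_masses_kg(formulation, batch_kg):
    """Mass of each ingredient (kg) needed for a batch of the given size."""
    return {name: pct / 100.0 * batch_kg for name, pct in formulation.items()}


def protease_units(dose_units_per_kg, batch_kg):
    """Total protease activity (units) needed to dose a batch of feed."""
    return dose_units_per_kg * batch_kg


total_percent = sum(feed_wt_percent.values())
print(f"Sum of listed ingredients: {total_percent:.2f} wt %")

batch_kg = 500  # hypothetical batch size chosen for illustration
for name, mass in batch_masses_kg(feed_wt_percent, batch_kg).items():
    print(f"{name}: {mass:.2f} kg")

# Supplementation levels given in the text.
for dose in (2000, 4000, 6000):
    print(f"{dose} units/kg -> {protease_units(dose, batch_kg)} units per batch")
```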



All patents and publications mentioned in the specification are indicative of the levels 
of those skilled in the art to which the invention pertains. All patents and publications are 
herein incorporated by reference to the same extent as if each individual publication was 
specifically and individually indicated to be incorporated by reference. However, the citation 
of any publication is not to be construed as an admission that it is prior art with respect to 
the present invention. 

Having described the preferred embodiments of the present invention, it will appear 
to those ordinarily skilled in the art that various modifications may be made to the disclosed 
embodiments, and that such modifications are intended to be within the scope of the 
present invention. 
Those of skill in the art readily appreciate that the present invention is well adapted
to carry out the objects and obtain the ends and advantages mentioned, as well as those 
inherent therein. The compositions and methods described herein are representative of 
preferred embodiments, are exemplary, and are not intended as limitations on the scope of 
the invention. It is readily apparent to one skilled in the art that varying substitutions and 
modifications may be made to the invention disclosed herein without departing from the 
scope and spirit of the invention. 

The invention illustratively described herein suitably may be practiced in the absence
of any element or elements, limitation or limitations, not specifically disclosed herein.
The terms and expressions which have been employed are used as terms of description
and not of limitation, and there is no intention, in the use of such terms and expressions,
of excluding any equivalents of the features shown and described or portions thereof, but it
is recognized that various modifications are possible within the scope of the invention 
claimed. Thus, it should be understood that although the present invention has been 
specifically disclosed by preferred embodiments and optional features, modification and 
variation of the concepts herein disclosed may be resorted to by those skilled in the art, and 
that such modifications and variations are considered to be within the scope of this invention 
as defined by the appended claims. 

The invention has been described broadly and generically herein. Each of the
narrower species and subgeneric groupings falling within the generic disclosure also forms
part of the invention. This includes the generic description of the invention with a proviso or 
negative limitation removing any subject matter from the genus, regardless of whether or not 
the excised material is specifically recited herein. 



